Column Types¶

In the tutorial, we stored scalar data types in our tables, like strings, numbers and timestamps. In practice, we often work with more complicated types that need to be converted to a data type our database supports.

Customising String Field Lengths¶

As we discussed in TEXT or VARCHAR, a str field type will be created as a VARCHAR, which has varying maximum-lengths depending on the database engine you are using.

For cases where you know you only need to store a certain length of text, string field maximum length can be reduced using the max_length validation argument to Field():

from typing import Optional

from sqlalchemy import Text
from sqlmodel import Field, Session, SQLModel, create_engine, select
from wonderwords import RandomWord


class Villian(SQLModel, table=True):
    id: Optional[int] = Field(default=None, primary_key=True)
    name: str = Field(index=True)
    country_code: str = Field(max_length=2)
    backstory: str = Field(sa_type=Text())

👀 Full file preview

from typing import Optional

from sqlalchemy import Text
from sqlmodel import Field, Session, SQLModel, create_engine, select
from wonderwords import RandomWord


class Villian(SQLModel, table=True):
    id: Optional[int] = Field(default=None, primary_key=True)
    name: str = Field(index=True)
    country_code: str = Field(max_length=2)
    backstory: str = Field(sa_type=Text())


sqlite_file_name = "database.db"
sqlite_url = f"sqlite:///{sqlite_file_name}"

engine = create_engine(sqlite_url, echo=True)


def create_db_and_tables():
    SQLModel.metadata.create_all(engine)


def generate_backstory(words: int) -> str:
    return " ".join(RandomWord().random_words(words, regex=r"\S+"))


def create_villains():
    villian_1 = Villian(
        name="Green Gobbler", country_code="US", backstory=generate_backstory(500)
    )
    villian_2 = Villian(
        name="Arnim Zozza", country_code="DE", backstory=generate_backstory(500)
    )
    villian_3 = Villian(
        name="Low-key", country_code="AS", backstory=generate_backstory(500)
    )

    with Session(engine) as session:
        session.add(villian_1)
        session.add(villian_2)
        session.add(villian_3)

        session.commit()


def count_words(sentence: str) -> int:
    return sentence.count(" ") + 1


def select_villians():
    with Session(engine) as session:
        statement = select(Villian).where(Villian.name == "Green Gobbler")
        results = session.exec(statement)
        villian_1 = results.one()
        print(
            "Villian 1:",
            {"name": villian_1.name, "country_code": villian_1.country_code},
            count_words(villian_1.backstory),
        )

        statement = select(Villian).where(Villian.name == "Low-key")
        results = session.exec(statement)
        villian_2 = results.one()
        print(
            "Villian 2:",
            {"name": villian_2.name, "country_code": villian_2.country_code},
            count_words(villian_1.backstory),
        )


def main():
    create_db_and_tables()
    create_villains()
    select_villians()


if __name__ == "__main__":
    main()

Warning

Database engines behave differently when you attempt to store longer text than the character length of the VARCHAR column. Notably:

SQLite does not enforce the length of a VARCHAR. It will happily store up to 500-million characters of text.
MySQL will emit a warning, but will also truncate your text to fit the size of the VARCHAR.
PostgreSQL will respond with an error code, and your query will not be executed.

However if you need to store much longer strings than VARCHAR can allow, databases provide TEXT or CLOB (character large object) column types. We can use these by specifying an SQLAlchemy column type to the field with the sa_type keyword argument:

from wonderwords import RandomWord


class Villian(SQLModel, table=True):
    id: Optional[int] = Field(default=None, primary_key=True)
    name: str = Field(index=True)
    country_code: str = Field(max_length=2)
    backstory: str = Field(sa_type=Text())


sqlite_file_name = "database.db"
sqlite_url = f"sqlite:///{sqlite_file_name}"

engine = create_engine(sqlite_url, echo=True)


def create_db_and_tables():
    SQLModel.metadata.create_all(engine)


def generate_backstory(words: int) -> str:
    return " ".join(RandomWord().random_words(words, regex=r"\S+"))


def create_villains():
    villian_1 = Villian(
        name="Green Gobbler", country_code="US", backstory=generate_backstory(500)
    )
    villian_2 = Villian(
        name="Arnim Zozza", country_code="DE", backstory=generate_backstory(500)
    )
    villian_3 = Villian(
        name="Low-key", country_code="AS", backstory=generate_backstory(500)
    )

    with Session(engine) as session:
        session.add(villian_1)
        session.add(villian_2)
        session.add(villian_3)

        session.commit()

Tip

Text also accepts a character length argument, which databases use to optimise the storage of a particular field. Some databases support TINYTEXT, SMALLTEXT, MEDIUMTEXT and LONGTEXT column types - ranging from 255 bytes to 4 gigabytes. If you know the maximum length of data, specifying it like Text(1000) will automatically select the best-suited, supported type for your database engine.

With this approach, we can use any kind of SQLAlchemy type. For example, we can store pickled objects in the database:

from datetime import UTC, datetime
from typing import TypedDict

from sqlalchemy import PickleType
from sqlmodel import Field, Session, SQLModel, create_engine, select


class ModelOutput(TypedDict):
    model_checkpoint: datetime
    score: float


class ModelResult(SQLModel, table=True):
    id: int = Field(default=..., primary_key=True)
    output: ModelOutput = Field(sa_type=PickleType())


sqlite_file_name = "database.db"
sqlite_url = f"sqlite:///{sqlite_file_name}"

engine = create_engine(sqlite_url, echo=True)


def create_db_and_tables():
    SQLModel.metadata.create_all(engine)


def create_model_results():
    checkpoint = datetime.now(tz=UTC)

    result_1 = ModelResult(
        output={
            "model_checkpoint": checkpoint,
            "score": 0.9123,
        }
    )
    result_2 = ModelResult(
        output={
            "model_checkpoint": checkpoint,
            "score": 0.1294,
        }
    )
    result_3 = ModelResult(
        output={
            "model_checkpoint": checkpoint,
            "score": 0.4821,
        }
    )

    with Session(engine) as session:
        session.add(result_1)
        session.add(result_2)
        session.add(result_3)

        session.commit()


def get_average_score():
    with Session(engine) as session:
        statement = select(ModelResult)
        result = session.exec(statement)
        model_results = result.all()

    scores = [model_result.output["score"] for model_result in model_results]

    print("Average score:", sum(scores) / len(scores))


def main():
    create_db_and_tables()
    create_model_results()
    get_average_score()


if __name__ == "__main__":
    main()

Supported Types¶

Python types are mapped to column types as so:

Python type	SQLAlchemy type	Database column types
str	String	VARCHAR
int	Integer	INTEGER
float	Float	FLOAT, REAL, DOUBLE
bool	Boolean	BOOL or TINYINT
datetime.datetime	DateTime	DATETIME, TIMESTAMP, DATE
datetime.date	Date	DATE
datetime.timedelta	Interval	INTERVAL, INT
datetime.time	Time	TIME, DATETIME
bytes	LargeBinary	BLOB, BYTEA
Decimal	Numeric	DECIMAL, FLOAT
enum.Enum	Enum	ENUM, VARCHAR
uuid.UUID	GUID	UUID, CHAR(32)

In addition, the following types are stored as VARCHAR:

ipaddress.IPv4Address
ipaddress.IPv4Network
ipaddress.IPv6Address
ipaddress.IPv6Network
pathlib.Path
pydantic.networks.IPvAnyAddress
pydantic.networks.IPvAnyInterface
pydantic.networks.IPvAnyNetwork
pydantic.EmailStr

Note that while the column types for these are VARCHAR, values are not converted to and from strings.

IP Addresses¶

IP Addresses from the Python ipaddress module are stored as text.

from datetime import UTC, datetime
from pathlib import Path
from uuid import UUID, uuid4

from pydantic import EmailStr, IPvAnyAddress
from sqlmodel import Field, Session, SQLModel, create_engine, select


class Avatar(SQLModel, table=True):
    id: UUID = Field(default_factory=uuid4, primary_key=True)
    source_ip_address: IPvAnyAddress
    upload_location: Path
    uploaded_at: datetime = Field(default=datetime.now(tz=UTC))
    author_email: EmailStr

Filesystem Paths¶

Paths to files and directories using the Python pathlib module are stored as text.

from datetime import UTC, datetime
from pathlib import Path
from uuid import UUID, uuid4

from pydantic import EmailStr, IPvAnyAddress
from sqlmodel import Field, Session, SQLModel, create_engine, select


class Avatar(SQLModel, table=True):
    id: UUID = Field(default_factory=uuid4, primary_key=True)
    source_ip_address: IPvAnyAddress
    upload_location: Path
    uploaded_at: datetime = Field(default=datetime.now(tz=UTC))
    author_email: EmailStr

Tip

The stored value of a Path is the basic string value: str(Path('../path/to/file')). If you need to store the full path ensure you call absolute() on the path before setting it in your model.

UUIDs¶

UUIDs from the Python uuid module are stored as native UUID types in supported databases (just PostgreSQL at the moment), otherwise as a CHAR(32).

from datetime import UTC, datetime
from pathlib import Path
from uuid import UUID, uuid4

from pydantic import EmailStr, IPvAnyAddress
from sqlmodel import Field, Session, SQLModel, create_engine, select


class Avatar(SQLModel, table=True):
    id: UUID = Field(default_factory=uuid4, primary_key=True)
    source_ip_address: IPvAnyAddress
    upload_location: Path
    uploaded_at: datetime = Field(default=datetime.now(tz=UTC))
    author_email: EmailStr

Email Addresses¶

Email addresses using Pydantic's EmailStr type are stored as strings.

from datetime import UTC, datetime
from pathlib import Path
from uuid import UUID, uuid4

from pydantic import EmailStr, IPvAnyAddress
from sqlmodel import Field, Session, SQLModel, create_engine, select


class Avatar(SQLModel, table=True):
    id: UUID = Field(default_factory=uuid4, primary_key=True)
    source_ip_address: IPvAnyAddress
    upload_location: Path
    uploaded_at: datetime = Field(default=datetime.now(tz=UTC))
    author_email: EmailStr

Custom Pydantic types¶

As SQLModel is built on Pydantic, you can use any custom type as long as it would work in a Pydantic model. However, if the type is not a subclass of a type from the table above, you will need to specify an SQLAlchemy type to use.