Data Design

When designing an application feature, much consideration is given towards presentation, usability, content, and accessibility; but data structures are relegated to implementation details.

Relational Model

In most applications, regardless of programming language, a traditional database is used to store persistent system- and user-based information. What makes this database “traditional” is its use of independent relations (or tables) of related information. Each table consists of a set of tuples (or rows) comprised of attributes (or columns); with each tuple uniquely identified by a key, allowing links between relations to be defined.

Object-relational mapping (i.e. ORM) is a strategy to map relations directly to programming objects. Each row becomes a logical model of database attributes, identified by the relation’s unique key. Links between relations are maintained, allowing for data relationships to be traversed between models. These relationships are the fundamental structure for data design.

Many programming frameworks come bundled with an ORM, e.g. Django, Laravel, Ruby on Rails; while other libraries such as SQLAlchemy and Sequelize can be integrated as desired.

Entity Relationships

A database row (or tuple) can be thought of as an independent entity logically defined by its specific attributes. This logical entity is what is mapped to an object by the ORM. Relationships between entities are described most commonly by one of three entity-relationship models: one-to-one, one-to-many, and many-to-many. These cardinality-based models refer to the number of relationships existing between rows (or entities) in one table to those in another.

To help discus these models, consider an application which lists baseball players along with their current batting information. A data model for players could list biographical data, i.e. team, position, height, and weight. The model would be defined programmatically according to the ORM, but some examples are shown.

from django.db import models

class Player(models.Model):
    name = models.CharField(max_length=255)
    team = models.CharField(max_length=255)
    position = models.CharField(max_legth=255)
    number = models.PositiveSmallIntegerField()
const { Sequelize, DataTypes } = require('sequelize');

const Player = sequelize.define('Player', {
  name: { type: DataTypes.STRING, allowNull: false },
  team: { type: DataTypes.STRING },
  position: { type: DataTypes.STRING },
  number: { type: DataTypes.INTEGER },
});

One-to-one

A one-to-one relationship is defined by two entities A and B having exactly one relationship between them. It can be said that “A has one B”, and vice-versa. This is modeled in the database by one of the tables storing the other’s primary key (PK) as a related attribute, referred to as a foreign key (FK).

Along with our Player model above, a second data model could list batting records, including strike outs, hits, bases gained, and scoring runs totaled. This model would need to have a foreign key to relate each record back to a single player. Batting records would have one (i.e. belong to a) player record, and players could have one (or possibly no) batting record.

class Player < ApplicationRecord
  has_one :batter
end

class Batter < ApplicationRecord
  belongs_to :player
end
use Illuminate\Database\Eloquent\Model;

class Player extends Model 
{
    public function batter()
    {
        return $this->hasOne(Batter::class);
    }
}

The benefits of a one-to-one relationship are not always apparent. After all, the batting records could be attributes of the player relation itself. Consider , however, players who might not have any batting statistics, such as relief pitchers. The one-to-one relationship prevents this by “extending” the player table when appropriate.

One-to-many

The most common relationship is that of one-to-many, whereby an entity in table A is related to many entities in table B. This can also be expressed as “A has many B”. As with the one-to-one model above, the inverse relationship from B to A is described as “B belongs to A”.

Continuing with our example baseball app, we can now expand our model such that each player is associated with many batting records, each of which comprise statistics from a specific game. Our app can now calculate player statistics using aggregate sums of filtered batting records, i.e. by most recent count.

ORM examples for these definitions follow.

from sqlalchemy import Column, ForeignKey, Integer, Table
from sqlalchemy.orm import declarative_base, relationship

Base = declarative_base()

class Player(Base):
    __tablename__ = "players"
    id = Column(Integer, primary_key=True)
    batters = relationship("Batter")

class Batter(Base):
    __tablename__ = "batters"
    id = Column(Integer, primary_key=True)
    player_id = Column(Integer, ForeignKey("player.id"))
use Illuminate\Database\Eloquent\Model;

class Player extends Model
{
    public function games()
    {
        return $this->hasMany(Batter::class);
    }
}

class Batter extends Model
{
    public function player()
    {
        return $this->belongsTo(Player::class);
    }
}

Many-to-many

The final entity-relationship model to discuss involves many-to-many relationships: an entity in table A can have many entities in table B, and an entity in table B can have many entities in table A. Sometimes this will be referred to as “A has and belongs to many B”. Unlike our previous relations, we can no longer use a single foreign key in one of the associated models to identify our relationships. Instead, an intermediary (or association) table is needed, where each row is comprised of foreign keys identifying each of the related models.

The final expansion of our baseball app will add games to our logical models. Players and games will comprise a many-to-many relationship whereby each player will play in many games, and each game will have many players.

With an ORM, the association table is often hidden behind the many-to-many implementation. In the following example, three tables would be generated by Django, one each for the Player and Game models, as well as an association players-games table with one foreign key each for unique combinations of player and game:

from django.db import models

class Game(models.Model):
    # ...
    players = models.ManyToManyField(Player)

This hidden association table, however, is not difficult to build on our own, and would offer many additional benefits beyond a simple ORM association. For example, consider using our batters model as the intermediary between players and games. By adding a foreign key to games, our batters model now can be used to calculate player statistics using game filters, i.e. opponent or date.

const Player = sequelize.define('Player', { /* ... */ });
const Game = sequelize.define('Game', { /* ... */ });
const Batter = sequelize.define('Batter', { /* ... */ });

Player.belongsToMany(Game, { through: Batter });
Game.belongsToMany(Player, { through: Batter });

Most ORM associations can be extended with custom tables, but vary in complexity and usage. Ultimately, however, they’re all just syntactic sugar to access distant models using custom intermediary models as association tables.

Conclusion

With this review of entity-relationship models, hopefully you can see how upfront data design can reduce programming complexity while increasing application capabilities. ORM features attempting to hide or obfuscate the details can be quite helpful to get up-and-running, but I would encourage taking the time to experiment without these short-cuts and increase your understanding of the models themselves.

The JBS Quick Launch Lab

Free Qualified Assessment

Quantify what it will take to implement your next big idea!

Our assessment session will deliver tangible timelines, costs, high-level requirements, and recommend architectures that will work best. Let JBS prove to you and your team why over 24 years of experience matters.

Get Your Assessment