I want to build something with a nosql back end but with foreign keys supported.
As far as I know, cosmos and mongo do not support foreign keys. I realize that having foreign keys imposes a little bit of schema on my data but I think this is a good thing. So I have come up with a hybrid approach of storing data in a traditional database as json, but using calculated columns to support foreign keys etc.
At the moment this is scarcely more than an idea, and to demonstrate see the following code.
type KeyFunction =
|Simple
|Unique
// |Foreign of string ; not yet supported
type KeyName = KeyName of string with member this.AsString = match this with KeyName s -> s
type KeyDef<'a> = {
KeyPath:string
GetKeyValue:'a -> int // don't actualy have a use for this yet
KeyFunction: KeyFunction
}
[<AutoOpen>]
module Demo =
open MySqlConnector
let getCreateTableSql tableName keys =
let keySql (keyName:KeyName, keyDef) =
let keyFunctionSql =
match keyDef.KeyFunction with
|Simple -> sprintf "key (%s)" keyName.AsString
|Unique -> sprintf "uinque key (%s)" keyName.AsString
[
sprintf "%s int as (json_value(json_info, '%s.%s')) stored" keyName.AsString keyDef.KeyPath keyName.AsString
sprintf "check (%s is not null)" keyName.AsString
keyFunctionSql
]
let keysSql = keys |> Map.toList |> List.collect keySql
let createClauses =
[
"entity_id int not null auto_increment primary key" //automatically assigned by DB
"last_update timestamp not null" //automatically assigned by DB
"version_number int not null default 1" //gets incremented by a trigger
"json_info json not null"
"check (json_valid(json_info))"
yield! keysSql
] |> String.concat "," |> sprintf "create or replace table %s(%s)" tableName
let triggerSql =
[
"create or replace trigger main_table_versioning before update on main_table"
"for each row"
"set new.version_number = old.version_number + 1"
] |> String.concat " "
sprintf "%s;%s;" createClauses triggerSql
let createTable (conn:MySqlConnection) tableName keys =
let sql = getCreateTableSql tableName keys
use cmd = new MySqlCommand(sql, conn)
cmd.ExecuteNonQuery() |> ignore
The idea is that each table has some standard columns to manage entity ids and versioning etc, and the data is stored as json.
What I need help with is, I am trying to figure out a way to make this “nice-to-use”.
It would be nice if the user of the library could simply work with types and define relationships between types. The user does not need to specify any magic strings such as the json path for keys. All of this could be figured out by some combination of reflection, generics and SRTPs.
The library would translate all that into DB land. The user then works with the types and going from some record to a record reference by a foreign key could involve some sort of optics such as isomorphisms.
Perhaps eventually this could be a type provider.
Any advice on ergonomics and making it “nice-to-use” would be much appreciated.
type EntityId<'a> = EntityId of int
with
member this.AsInt = match this with EntityId i -> i
type Entity<'a> = {
EntityId:EntityId<'a>
LastUpdate:DateTime
VersionNumber:int
Item:'a
}
type EntityMap<'a> = EntityMap of Map<EntityId<'a>, Entity<'a>>
with
member private this.Map = match this with EntityMap m -> m
member this.Find id =
match this.Map.TryFind id with
|Some x -> Ok x
|None -> Error "key does not exist in entity collection"
type SomeRefableType = {
Field1:string
Field2:string
}
type SomeRefingType = {
Field1:DateTime
Filed2:TimeSpan
RefField:EntityId<SomeRefableType>
}
and some demo usage
let testMap : EntityMap<SomeRefableType> = Map.empty |> EntityMap
let testList : Entity<SomeRefingType> list = []
let x =
testList
|> List.map (fun elem -> testMap.Find elem.Item.RefField)
I feel like there is a lot of room for improvement. I don’t like that following a foreign key reference returns a Result<>. The fact that there is a foreign key on a non null value guarantees that there is a value to reference. The library should take advantage of this and return a plain value when it knows that all values from the foreign table should have been picked up.
This is by no means a direction I think is worth pursuing, just highlighting the type of usage it should cater for.
type SearchKey<'a> = {
KeyName:string
KeyPath:string
IsUnique:bool
}
type ForeignKey<'a> = TBD //how to parameterise the referenced collection?
type EntityId<'a> = EntityId of int with member this.AsInt = match this with EntityId i -> i
type Entity<'a> = {
EntityId:EntityId<'a>
LastUpdate:DateTime
VersionNumber:int
Item:'a
}
type Collection<'a> = TBD
type SearchOp = //everything csan be a string (or even an obj). But how can heterogenous types be handled?
|Gt of string // >
|Lt of string // <
|In of string list // in (...)
|Etc
type SearchTerm<'a> = SearchKey<'a> * SearchOp
//basic crud operations
let getEntity<'a> : Collection<'a> -> EntityId<'a> -> Option<Entity<'a>> = raise(NotImplementedException())
let getEntities<'a> : Collection<'a> -> SearchTerm<'a> list -> Entity<'a> list = raise(NotImplementedException())
let updateEntity<'a> : Collection<'a> -> Entity<'a> -> Entity<'a> = raise(NotImplementedException())
let updateEntities<'a> : Collection<'a> -> Entity<'a> -> Entity<'a> = raise(NotImplementedException())
let insertEntity<'a> : Collection<'a> -> 'a -> Entity<'a> = raise(NotImplementedException())
let insertEntities<'a> : Collection<'a> -> 'a list -> Entity<'a> list = raise(NotImplementedException())
let deleteEntity<'a> : Collection<'a> -> EntityId<'a> -> unit = raise(NotImplementedException())
let deleteEntities<'a> : Collection<'a> -> EntityId<'a> list -> unit = raise(NotImplementedException())
let createCollection<'a> : SearchKey<'a> list -> ForeignKey<'a> list -> Collection<'a> = raise(NotImplementedException())
let getCollection<'a> : unit -> Collection<'a> = raise(NotImplementedException())
Finding a way to include strongly types foreign keys is the only tricky part I think.
The simple answer is that there is no simple answer to these sort of problems, the only way to find out what works for your scenario is to invest R&D time into it.
The question is hard to answer because the performance requirements aren’t spelled out by the OP. It appears to be 75M/year records over a number of customers with a write rate of num_customers*1minute (which is low), but I don’t have figures for the required read / query performance.
Have you done any IDictionary related work? Looking into “controller” related programming eg. remapping keys, joystick remapping, other device plugin related remapping may offer a little insight for generating the isomorphisms you’re talking about. NoSQL is good because it is comparably a lightweight database form verses the familiar SQL standard.