  {% endblock %}
</body>
</html>
```

```django
{% extends "base.html" %}

{% block scripts %}
  {{ super() }}

  <script src="home.js"></script>
{% endblock %}
```

#### HyperScript

```js
let {chain, identity} = require("ramda")

let unnest = chain(identity) // chain is a flatMap alias, so this flattens one level of nesting

function renderBase(ctx={}, cbs={}) {
  return html([
    head(unnest([
      (cbs.head || identity)([
        title(ctx.seotitle),
      ])
    ])),
    body(unnest([
      (cbs.scripts || identity)([
        script({src: "base.js"}),
      ])
    ])),
  ])
}
```

Inheritance itself is replaced by applying the "parent" function from within the "child" function.
In this particular case `home.html` extends `base.html`, which means `renderHome` will call `renderBase`.

As `renderHome` needs access to the default values `renderBase` provides, we implement this
through callbacks. We add `cbs`, a dictionary of callbacks, where every callback is called with
the appropriate default result (the equivalent of Nunjucks `{{ super() }}`).

```js
let {append} = require("ramda")

function renderHome(ctx={}, cbs={}) {
  return renderBase(ctx, {
    scripts: super_ => append(script({src: "home.js"}), super_)
  })
}

// can be simplified to
function renderHome(ctx={}, cbs={}) {
  return renderBase(ctx, {
    scripts: append(script({src: "home.js"}))
  })
}
```

The `renderHome` refactoring is a self-evident advertisement for currying.
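To see why the simplified version works, here is a minimal sketch with plain strings instead of VNodes.
Ramda's `append` is curried out of the box, so partially applying it already produces the callback shape we need:

```js
let {append} = require("ramda")

// `append` is curried: given only the element, it returns a function
// that waits for the list -- exactly the `super_ => ...` callback above
let addHomeJs = append("home.js")

addHomeJs(["base.js"]) // => ["base.js", "home.js"]
// i.e. addHomeJs behaves like: super_ => append("home.js", super_)
```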
Curried functions are boilerplate killers.

You may think this is all a tiny bit complicated, and I kind of agree, but things like `flatMap`, `identity`, etc.
become second nature with practice. It took me much longer to describe all of this
than to write a working version of the code above with some manual tests.

It's good to know that Ramda already provides the same `unnest` implementation, so you don't need to
declare it in your own code. Once we get `Proxy`, we may go further and replace the `ctx={}` and `x || identity`
checks with code like this:

```js
// ES2016?
function renderBase(ctx={}, cbs=DefaultDict(identity)) {
  return html([
    head(unnest([
      cbs.head([
        title(ctx.seotitle),
      ])
    ])),
    body(unnest([
      cbs.scripts([
        script({src: "base.js"}),
      ])
    ])),
  ])
}
```

Here `DefaultDict` is a function returning a special dict
where every missing key falls back to a provided value (`identity` in our case).
But for now `Proxy` is still a draft.

### Layout

One more thing to add. Since the code you get from a base template is ordinary data, you can apply
all the magic of the programming language to it. How often have you wanted to remove an unnecessary
script (coming from a `super` call) from one particular page while keeping it in the parent template?
The only way to do that in Nunjucks was manual HTML parsing. HTML parsing with regular expressions is
brittle, so I bet it was easier to remove that script from the parent template altogether and repeat it
in every template except *that* one, which was as boring to write as it was to maintain.

Now you can just filter `super_.children` and remove that script. You only need to be aware of the
VirtualDOM `VNode` structure, and I assure you it's quite easy.
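For illustration, a hedged sketch of that filtering, assuming the `virtual-dom` `VNode` shape
(attributes live under `node.properties`); `renderLanding` and `analytics.js` are hypothetical,
standing in for any page and any script inherited from the base layout:

```js
let {reject} = require("ramda")

// drop one inherited script on this page only, keep everything else from the parent
function renderLanding(ctx={}, cbs={}) {
  return renderBase(ctx, {
    scripts: reject(node => node.properties && node.properties.src === "analytics.js"),
  })
}
```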
### Macros

Macros can be replaced with plain functions. They generally take fewer variables than layouts,
so such functions will benefit from curried signatures. For example, you'll probably want to define
`renderAlert(category, message)` rather than `renderAlert(ctx)`. You can put these functions into
a different folder to emphasize their different "personality".

Let's compare what we gain here.

#### Nunjucks (Jinja, Django...)

```django
{% macro renderAlert(category, message) %}
  <div class="alert alert-block alert-{{ category }} fade in">
    <button class="close" data-dismiss="alert" type="button">&times;</button>
    <p>
      {{ message | safe }}
    </p>
  </div>
{% endmacro %}
```

```django
<div class="page-alerts">
  {% if alerts.error %}
    {% for message in alerts.error %}
      {{ renderAlert("error", message) }}
    {% endfor %}
  {% endif %}

  {% if alerts.success %}
    {% for message in alerts.success %}
      {{ renderAlert("success", message) }}
    {% endfor %}
  {% endif %}

  {% if alerts.warning %}
    {% for message in alerts.warning %}
      {{ renderAlert("warning", message) }}
    {% endfor %}
  {% endif %}

  {% if alerts.info %}
    {% for message in alerts.info %}
      {{ renderAlert("info", message) }}
    {% endfor %}
  {% endif %}
</div>
```

#### HyperScript

```js
let {curry} = require("ramda")
let {decode} = require("ent")

let renderAlert = curry((category, message) => {
  return div(".alert.alert-block.fade.in", {className: `alert-${category}`}, [
    button(".close", {"data-dismiss": "alert", type: "button"}, decode("&times;")),
    p(decode(message)),
  ])
})
```

```js
let {map} = require("ramda")

let renderAlerts = (alerts) => {
  return div(".page-alerts", [
    alerts.error && map(renderAlert("error"), alerts.error),
    alerts.success && map(renderAlert("success"), alerts.success),
    alerts.warning && map(renderAlert("warning"), alerts.warning),
    alerts.info && map(renderAlert("info"), alerts.info),
  ])
}
```

Whew! HyperScript clearly wins this round. The code is way shorter and cleaner.
Another thing to notice here: we need to manually decode HTML entities in HyperScript.

The *wrapping macro* call style

```django
{% call foo %}
  bar
{% endcall %}
```

can be substituted with a `foo({content: "bar"})` call. Nothing special.
That's probably all I can say about macros.

### Rendering

Want to see the final rendering code? It's offensively simple.
For [Koa](https://github.com/koajs/koa), for example, it may look like this:

```js
let {merge} = require("ramda")
let toHTML = require("vdom-to-html")

app.context.render = function (pathToComponent, ctx={}) {
  let context = merge(ctx, {
    constants: constants,
    i18n: Globalize,
    session: this.session,
    alerts: this.alerts,
  })
  let vdom = require("shared/components/" + pathToComponent)(context)
  return toHTML(vdom)
}

app.context.renderBody = function (pathToComponent, ctx={}) {
  this.type = "html"
  this.body = this.render(pathToComponent, ctx)
  return this.body
}
```

```js
router.get("feedback", "/feedback",
  function* (next) {
    // ...

    this.renderBody("detail/page.feedback", {form, formErrors})
    return yield* next
  }
)
```

### Conclusion

So whether you like the final result or not, I hope it was interesting for you.
Manual HTML-to-HyperScript translation quickly becomes a bottleneck.
That's why I suggest using a [web service](http://html-to-hyperscript.paqmind.com).
Elm already has [a similar tool](http://mbylstra.github.io/html-to-elm/), which gave us additional inspiration.

{% endraw %}

# Paradigms in Programming

*@ivankleshnin, 2015-12-08*

## Short history of programming paradigms (draft)

![paradigms-evolution](./assets/paradigms-evolution.png)

## Control flow paradigms

### Non-structured

#### Based on

* [Turing Machine](https://en.wikipedia.org/wiki/Turing_machine)

#### Concepts

* Binary code
* Instruction
* Memory
* Processor
* Register

#### Examples

* ASM (1949)
* Fortran I (1957)
* BASIC (1964)

### Structured

#### Based on

* [Böhm-Jacopini theorem](http://www.cs.cornell.edu/~kozen/papers/bohmjacopini.pdf)

#### Concepts

* [Ban on GOTO statements](http://www.u.arizona.edu/~rubinson/copyright_violations/Go_To_Considered_Harmful.html)
* Control flow statements (if, for, while...)
* Code blocks
* Modules
* Flow diagrams

#### Examples

* Fortran II (1958)
* Algol (1958)
* All newer languages

## Data flow paradigms

### Procedural

If you think this paradigm is over, you're wrong.
Some new languages, like [Go](https://en.wikipedia.org/wiki/Go_(programming_language)), are a newer breed of this paradigm.

#### Based on

* Engineering
#### Concepts

* Module
* Procedure
* Scope

#### Examples

* Algol (1958)
* C (1972)
* Go (2009)

### Object-Oriented (OOP)

OOP became the mainstream paradigm in the 1990s by historical accident. Simula was created at a time when
people were obsessed with physical simulations. Parallels between real-world objects and
their program representations were too tempting to ignore. Simula was created as a descendant of Algol, which
already had lexical scoping (closures), but people did not see how useful that could be at the time.
C++ inherited from Simula, and Java became a "better C++", establishing the OO approach as the standard
way of thinking for a long period.

Object-Oriented programming is Procedural in disguise, stuffed with bad ideas
like local state, classes, inheritance, etc.

#### Based on

* Engineering

#### Concepts

* Coupling of data and behavior
* Polymorphism
* Inheritance
* Encapsulation
* Interfaces
* Message passing

#### Points of interest

* Abstractions
* Dependencies
* Patterns

#### Examples

* Simula (1960)
* C++ (1983)
* Java (1995)
* PHP (1995)

### Older Functional

#### Concepts

* Functional languages lacking some features that were coined later, like enforced immutability

#### Examples

* LISP (1958)

### Functional

#### Based on

* [Lambda Calculus](https://en.wikipedia.org/wiki/Lambda_calculus)

#### Concepts

* Currying
* First-class functions (functions can be passed to and returned from functions)
* Immutable data (data is immutable by default)
* Lambda functions
* Higher-order functions
* Pure functions

#### Points of interest

* Function composition
* Side effects
* State

#### Examples

* ML (1973)
* Haskell (1990)

### Logic

#### Based on

* Mathematical logic

#### Concepts

* Predicates (premises)
* Constraints (relations)

#### Examples

* Prolog

### Object-Oriented + Functional

#### Concepts

* Borrow ideas from the functional paradigm

#### Examples

* Python (1991)
* JavaScript (1995)
* Ruby (1995)

### Functional + Object-Oriented

#### Concepts

* Functional languages which allow OOP

#### Examples

* Scala (2004)
* Clojure (2007)

### Functional + Logic

#### Examples

* Erlang

## Time flow paradigms

More commonly applied to libraries than to languages.

### Interactive

### Reactive

#### Concepts

* Promises, Tasks, Futures
* Observer, Observable
* Coroutines

#### Examples

* RxJS

# Why LISPs have no auto-currying

*@ivankleshnin, 2015-11-22*

## Prerequisites

To understand this article you need to know:

1. What is [Currying](https://en.wikipedia.org/wiki/Currying)
2. What is a [Variadic function](https://en.wikipedia.org/wiki/Variadic_function)

## Why was Clojure created without auto-currying?

This thought was passively floating in my head until I was suddenly struck by an answer.
I love the ideas behind LISP, but it's also clear to me that currying brings more benefits
than default arguments and variadic functions. Composability can't be overrated.

**Currying** and **variadic functions** can be viewed as mutually exclusive.
While nothing prevents a language from having both kinds of functions, every language
tends to gravitate toward one pole or the other. Let's just say the two are not perfectly
compatible at the interface level.
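A small JavaScript sketch of that tension, using Ramda's `curry`; the `add2` and `addN` helpers
are hypothetical and only illustrate the arity question:

```js
let {curry} = require("ramda")

// fixed arity: currying knows when the function is "full"
let add2 = curry((x, y) => x + y)
add2(1)(2)    // => 3
add2(1, 2)    // => 3

// variadic: there is no arity to saturate, so auto-currying has nothing to count
let addN = (...xs) => xs.reduce((a, b) => a + b, 0)
addN(1)       // => 1 -- already a complete call, not a partial application
addN(1, 2, 3) // => 6
```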
Currying is effective when all (or most) functions are curried.
So this design decision is crucial and irreversible.
And Clojure was built with variadic functions.

As always, let's check [StackOverflow](http://stackoverflow.com/questions/31373507/rich-hickeys-reason-for-not-auto-currying-clojure-functions).

One of the proposed answers is "to make it Java compatible". I would believe that, but Common LISP, Scheme and Racket
also have no auto-currying. So compatibility may be one of the reasons, but I'm not satisfied yet.

Another (implied) answer is "because auto-currying can be confusing in a dynamically typed language".
That's a strong argument. Haskell wouldn't even allow you to mess things up, but in Clojure or JS
you can easily get lost in lambdas.

My favorite argument, a design one that no one mentioned, is "because of parens".
LISP users adore phrases like "parens make no difference", and while I kind of agree with what they mean,
the truth is the opposite in this case.

We need to go down to the basic syntax level. Arithmetic operations.
I'm going to provide examples in Haskell and Clojure, but everything said here
is valid for the corresponding language families as well.

In both Haskell and Clojure `+` is a function.
Haskell supports infix syntax: `1 + 2`, which is syntactic sugar for `(+) 1 2`.
Clojure supports only prefix calls: `(+ 1 2)`.

Now how do we add three numbers?
The Haskell version is `1 + 2 + 3`, which desugars to `(+) ((+) 1 2) 3`.
The Clojure version is `(+ 1 2 3)`.

Clojure obviously has no syntactic sugar and is nice-looking because of its variadic `+` function.
If `+` were a unary (curried) function like in Haskell, that would yield the disturbing:

```clj
((+ ((+ 1) 2)) 3)
((+ ((+ 1) 2)) ((+ 3) 4))
```

Whew! That happens to be less readable than raw lambda calculus!
Parens seemingly do not perform as well as space in the role of function applicator...

We could reinvent a `sum` function:

```clj
(sum [1 2 3])
```

or utilize reduce

```clj
(reduce + [1 2 3])
```

But they still require four characters instead of one.

So the current variadic approach gives us the cleanest view of arithmetic formulas possible for a LISP.
The same reasoning applies to logical operators, etc.

Macros to unfold human math syntax? They are not composable to begin with.
You can't just write `(map my-macro [1 2 3])`. You need to add a lambda wrapper at least.
A complete overkill for hello-world stuff...

Macros to create immediately curried functions? Well, you'll need to reimplement a great chunk of the Clojure standard
library to make things "curry-friendly". Some core Clojure functions also have the reversed argument order (i.e. data-first).
I doubt you have that much time to spend.

As already stated, currying is beneficial at a large scale, not as a micro data-flow optimization.
Trying to fix that, we immediately start to lose the simplicity which made LISP stand out.

So the variadic API choice for LISPs seems to have been predetermined by LISP syntax itself
right from the beginning. Not a pleasant thing to discover.

# Primary keys

*@ivankleshnin, 2015-11-16*

## Introduction

Primary keys are a constant problem.

1. Should they be natural or artificial?
2. If artificial, what kind exactly?
3. Where should primaries be generated (server or client)?

People tend to prefer simple answers like "always use surrogate PKs"
or "always use integers for PKs", especially if they were personally beaten by a particular problem.
Unfortunately, there is no simple answer.

## PK requirements

Let's see, at first, what StackOverflow [says about the subject](http://stackoverflow.com/questions/337503/whats-the-best-practice-for-primary-keys-in-tables).

> 1. Primary keys should be as small as necessary. Prefer a numeric type because numeric types are stored
  in a much more compact format than character formats. This is because most primary keys will be
  foreign keys in another table as well as used in multiple indexes. The smaller your key, the smaller the index,
  the less pages in the cache you will use.

> 2. Primary keys should never change. Updating a primary key should always be out of the question.
  This is because it is most likely to be used in multiple indexes and used as a foreign key.
  Updating a single primary key could cause a ripple effect of changes.

> 3. Do NOT use "your problem primary key" as your logic model primary key.
  For example passport number, social security number, or employee contract number, as these "primary keys"
  can change for real world situations.

> On surrogate vs natural key, I refer to the rules above.
  If the natural key is small and will never change it can be used as a primary key.
  If the natural key is large or likely to change I use surrogate keys.
  If there is no primary key I still make a surrogate key because experience shows
  you will always add tables to your schema and wish you'd put a primary key in place.

> Answered by @Logicalmind

Not a bad place to start. Point 3 here is actually an extension of point 2, because it basically
describes how things you imagine to be constant may change.

I wouldn't say that "primary keys never change", because there are always exceptions.
Things may get screwed up to the point where you'll have to dive into your DB console, or write a script,
to change primaries manually. This will mean fighting foreign key constraints and being very careful
not to destroy your data, but there may be no other choice.

The advice

> Prefer a numeric type because numeric types are stored in a much more compact format than character formats

is very one-sided. The story behind keys is much more complicated.

From the usability point of view (clients, DB analysts), primaries should be:

* unique
* short and readable (memorizable)
* secure (contain no private information)

Something like a random (non-sequential) number is the best option here.

From the data architecture point of view (program), primaries should be:

* globally unique (read: very long)
* easy and fast to generate
* possible to generate in parallel

Something like a UUID is the best option here.

From the maintenance point of view (DB), primaries should be:

* unique
* as short as possible
* sequential, to minimize disk fragmentation

An auto-incremented integer is the best option here.

Mutually exclusive paragraphs detected. Being "unique" contradicts being "as short as possible".
Being "sequential" is the opposite of being "random". Depending on whom you ask, you'll get different answers.
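To make the contrast concrete, here is a hedged Node sketch of the three key styles; the `node-uuid`
package is an assumed dependency, and the in-memory counter merely stands in for a database sequence:

```js
let crypto = require("crypto")
let uuid = require("node-uuid") // assumed dependency for UUID v4 generation

// DB-friendly: short and sequential, but only one process can safely hand these out
let counter = 0
let nextAutoincId = () => ++counter

// architecture-friendly: globally unique and generatable anywhere, but long and random
let nextUuid = () => uuid.v4() // e.g. "f47ac10b-58cc-4372-a567-0e02b2c3d479"

// usability-friendly: short and readable, yet neither global nor collision-proof
let nextShortId = () => crypto.randomBytes(4).toString("hex") // e.g. "9f86d081"
```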
Depending on whom you ask you'll get different answers.\n\nPeople tend to be the most concerned about their direct jobs, that's why devops will assure you that\nautoincremental integers are \"obviously\" the best solution.\nThey will say \"Perfomance should be thinked upfront\" and \"Big indexes kills performance\",\nsimply ignoring that additional code *someone else* will be required to write and support.\n\nProgrammers will recommend \"obvious\" UUID because it's so random, does not expose row quantity and fashionable.\n\"Disk space is cheap\" they say.\n\nManagers may also invade this party with some unexpected wishes.\n\n## PK investigations\n\nLet's explore this further. Numeric types have several weaknesses. They make your business data leak to the outer world.\nYour competitors should never know how many purchases your e-store has already made and makes per day.\n\nBut when you expose a link like\n`http://my-cool-estore.com/orders/35`\nyoure going naked.\n\nI met opinions that you shouldn't use PK in URL because it's \"insecure\".\nThis seems a nonsense to me because the whole point of PK is to represent your domain model.\nEverywhere. Including other systems. The complexity of architecture depends greatly on this.\n\nIf you provide access to your model through other field (or fields) you need to ensure that it's\nunique, does not change over time and apply almost every other criteria you've applied to PK before.\n\nThis spoils the whole picture in practice because you have non-PK that should behave like one,\nthat requires additional index, making your tables bigger, your dataflow is splitted to\nid and non-id based strategies, and so on. And all this to satisfy some crippled vision of security.\n\nLet's start with AUTOINC approach.\nNumeric keys have problems. They stored as numbers but used as strings (you never add or multiply PKs, right?).\nSo you'll have to coerce between types in some places (urls, maybe database, etc.).\nThis is boresome and error-prone. Numbers also naturally have an overflow limit pushing you to choose bigger ranges upfront.\nSome of the platforms may have surprisingly little maximum number.\nFor example, EcmaScript's `Number.MAX_SAFE_INTEGER` is only `9007199254740991` which may be a deal breaker for big data.\n\nWhen it's come to distributed systems, AUTOINC keys fail shortly because you can't generate them\nin parallel. This is also important because you may want to generate ids on the client.\nIn case of custom implementations it's easy to find yourself in a situation where you\nneed to rely on [uncertain community projects](https://github.com/justaprogrammer/ObjectId.js)\nbecause DB authors simply \"forgot\" that part of the deal.\n\nSo we need to come with really unique string identifier which does not expose you business stuff.\nWhy not just take something really random like [UUIDs](https://en.wikipedia.org/wiki/Universally_unique_identifier)\nas a standard solution.\n\nSome new databases like RethinkDB did exactly that and chose to use standard UUID to represent\ntheir default surrogate keys. Some others like MongoDB came with their own [approximations](https://docs.mongodb.org/manual/reference/object-id/).\nUUID seems to be the best choice because basically every language has libraries\nto generate UUIDs. Being crossplatform and standard matters. Why bother with something else?\n\nBecause implementation matters. UUID v4 provides very long and incosistent sequences. 
This means\nthat you tables [will become big and slow](http://stackoverflow.com/questions/11938044/what-are-the-best-practices-for-using-a-guid-as-a-primary-key-specifically-rega).\n\n\"Never use UUID\" is not the answer as well because if you're going to index that field\nnevertheless, and access by that index most of the time, this is a changing factor and\nyou may win nothing by adding surrogate key. A lot depends of your DB of choice and it's storage strategy.\n\nWhat about natural vs surrogate primaries. Surrogate keys are obviously better when you\ngenerate a lot of models. Primaries have to be generated because you generally\ncan't grant PK selection to occasional visitors and you don't want to bother them with technical details.\n\nPeople are inclined to thought patterns.\nIn CMS, it may be a good idea to index your \"pages\" or \"documents\" by their local URLs.\n\n```\n== ID == | == CONTENT ==\n/about-us | ...\n/contacts | ...\n```\n\nMost of CMSs just add another field named \"alias\" or \"slug\" to bind URL and data.\nThey argue that you may want to change URL. In general case, though, ULR change is complicated.\nYou need to add redirection rules for search spiders at least and this rules are decoupled from\nyour database, making things messy.\n\nSuch \"slug\" field duplicates PK's purpose and require indexing so you may think\nyou save space and performance by adding surrogate PK while in reality you don't.\n\nThe number of such \"pages\" in a typical website won't come close even to 10K.\nEven if you write an article per day, 3k articles will be made in a 10 years...\n\nSo why CMS's are projected like they are something different? Because people always try to simplify things.\nTheir heard of \"always-use-surrogate-keys\" mantra and follow it.\n\nSlugs are hacks over the data architecture, but in the end,\nit's always a question of resources, usage patterns and audience.\nIf you are able to lurk into DB console for occasional *hacky actions*,\nyou may remove that permanent *hacky code* in your repo and vice-versa.\n\nThe underline is that PK choice and generation strategies are much more complex that it seems from the first sight.\nMoreover, it turns out to be among the primary scaling concerns.\nPeople invent things like [Twitter Snowflake](https://github.com/twitter/snowflake) for a reason.\n\n## Natural vs Artificial keys\n\n##### Artificial Numeric\n\n\\+ fastest read / write raw performance
\+ fastest sorting
\+ fastest machine maintenance operations (replication, deduplication, etc.)
\+ minimum fragmentation
\- lead to business data leaks

##### Artificial Unique String (Random)

\* something between "Artificial Numeric" and "Natural", depending on length
\- cause fragmentation

##### Artificial Unique String (Sequentially Random)

\* like "Artificial Unique String (Random)" but with less fragmentation
\- harder to implement

##### Natural

\+ minimum number of joins / queries
\+ minimum DB size in some cases
\+ easiest maintenance in some cases

##### Natural Composite

\* like "Natural" but more cumbersome
\+ the best choice for M-to-N tables
A word about "Natural Composite". When people define an ID for an M-to-N relation table (which is required
by many frameworks and libraries...), they reluctantly imply that more than one copy of the same M-to-N relation is possible.
For example, it becomes possible to associate a tag with an image twice, despite that conveying no meaning:

```
id (PK) | image_id | tag_id
   1    |    1     |   1
   2    |    1     |   2
   3    |    1     |   1   -- ^_^
```

unless an additional constraint is cast on the `image_id` + `tag_id` pair. But it would be much better
to make `image_id + tag_id` a composite primary key and a) get rid of the additional meaningless column,
b) get that constraint for free, c) have any additional columns depend directly on the PK
and not on an ID surrogate which has no meaning at all.

```
image_id (PK) | tag_id (PK)
      1       |     1
      1       |     2
```

An M-to-N table gains meaning only with joins. If you will never access some data by a PK, don't add that PK.
If your library insists, reconsider your library choice. The mantra "every table must have an ID column" comes solely
from crappy object-oriented ORMs which chose to "not support" composite keys for the sake of "simplicity".

Read [this](http://sqlblog.com/blogs/aaron_bertrand/archive/2010/02/08/bad-habits-to-kick-putting-an-identity-column-on-every-table.aspx)
and [this](http://weblogs.sqlteam.com/jeffs/archive/2007/08/23/composite_primary_keys.aspx) and let's
be done with it.

## Frontend vs backend for id generation

Let's put enterprise stuff aside for now. Imagine we have a reasonably small table and want a UUID PK.
Where should we generate the ids?

Many people don't even raise this question because they're bound by mainstream paradigms,
where AUTOINC keys used to be "the only" choice. The modern web dev landscape is still poisoned
by "Active Record" and ORM-centered frameworks which force the user to "be predictable" and "keep the defaults".

When you choose Django or something similar, the most interesting choices are made for you.
And you take them as "natural" or "obvious" when they're only a slice of the possibilities.

You need to start typing to see more of the flaws behind this.
I mean typing with types, not with a keyboard.

Assume we generate primaries on the backend.
What types do your models have?
Let's imagine something simple:

```js
let Tc = require("tcomb")

let Robot = Tc.struct({
  id: Tc.String,
  about: Tc.String,
  create_date: Tc.Date,
})

let robot1 = Robot({
  id: "f47ac10b-58cc-4372-a567-0e02b2c3d479",
  about: "I'm an awesome robot",
  create_date: new Date(),
})
```

But if you create your models on the frontend and push them to an API (as you should do), this is not correct,
because you create your models *before* you've got an id.
You're about to send the model to the server, but you can't even construct the object.

One choice in this particular case is to send raw untyped data, but this will spoil your architecture.
Why should "create" and "edit" actions be so different?

You can also describe both types explicitly:

```js
let Tc = require("tcomb")

let AlmostRobot = Tc.struct({
  about: Tc.String,
  create_date: Tc.Date,
})

let Robot = AlmostRobot.extend({
  id: Tc.String, // or Uid
})

let robotToCreate = AlmostRobot({
  about: "I'm an awesome robot",
  create_date: new Date(),
})
```

Two types are required for every model you allow to be created on the frontend.
That's a lot of code to write and support, seemingly out of thin air...

Also keep in mind that models without ids are "broken" from many points of view.
You can't store them in a global dict keyed by PK, for example.

One way or another, the conclusion suggests itself that frontend generation of primaries is the better choice
(from a purist point of view). On the backend, both `POST /robots/` and `PUT /robot/:id` endpoints should be allowed and supported.

So we can outline the whole picture now. I'll keep my recommendations a draft,
because I believe there is still much more to say and evaluate.

## Draft recommendations

### Very small data that rarely changes

Such tables tend to be required everywhere. Read performance is much more critical compared to write performance.
Choose natural, meaningful PKs whenever possible to reduce the number of joins or subqueries required.

### Small data, predictable requirements

Choose a natural PK whenever possible. Data access and analysis will be easier. You won't duplicate things.

A phone number may be a very good PK for a `phones` table because it's unique and never changes.
It's not too long and not too sparse (only digits).

Keep the privacy concern in mind.
Even if you forbid changing emails, an email is not a good PK for a `users` table because
it exposes private data.

Note that the phone and email cases are different.
If you're going social, you will expose a `/users/john.doe@mail.com` URL, and that is a business data leak.
Phone URLs will likely be accessible to staff only.

If, however, you are going to publish URLs like `/phones/555-555-5555`, you'd better choose a surrogate PK instead.

If you're not sure how your data will change and requirements will evolve, choose a surrogate PK.

Again, some people will argue that "nothing can be predicted" and advise to "always use an identity PK"
because of that. For me personally, it's easier to deal with the rare cases where I was wrong than
to puzzle everything out right from the beginning.

### Unpredictable requirements

Choose a time-proven surrogate PK. Stick with the PK type your DB is best tuned for.
Do some investigation upfront; you may find a lot of information to surprise you.
Generate your PKs on the client if possible, as sketched below.
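A hedged sketch of what that looks like, reusing the `Robot` type from above; the `node-uuid` dependency,
the browser `fetch` call, and the exact endpoint shape are assumptions:

```js
let uuid = require("node-uuid") // assumed dependency

// the client mints the id itself, so the model is complete before it ever reaches the API
let robot = Robot({
  id: uuid.v4(),
  about: "I'm an awesome robot",
  create_date: new Date(),
})

// creation becomes an idempotent PUT to a known URL instead of a POST
fetch(`/robot/${robot.id}`, {method: "PUT", body: JSON.stringify(robot)})
```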
You may find a lot of information to surprise yourself.\nGenerate your PK in if it's possible.\n\n### Big data\n\nChoose what will give the required performance and scalability.\nThe big data solutions tend to be messy and pragmatic rather than pure and idealistic.\n" } ], "breadcrumbs": [ [ "/", "Paqmind α" ], [ "/blog/", "Blog" ] ] } }, "testimonials": [], "_loading": { "root/blog/": 0 } }