Tweag

JSON Schemas to Nickel contracts

19 October 2023 — by Taeer Bar-Yam

At Tweag we have been cooking up a JSON Schema to Nickel contract converter, which we’re excited to announce!

Background

Nickel is a configuration language being developed at Tweag. You can get some deep dives into its design from previous blog posts. I’ll summarize it here as JSON, plus functions, plus types and contracts. One of its main use cases is generating JSON configurations for other programs (Terraform, GitHub Actions, etc.). Functions allow you to generate large configurations without repeating information; types and contracts help detect malformed configurations early and pinpoint where the problem is.

In the JSON world, the role of contracts is served by JSON Schemas, which likewise specify the form that some JSON-encoded data should have. It’s very natural, then, to have some way of importing these JSON Schema specifications into Nickel as contracts.

Use

In the simplest case, Nickel can now serve as another JSON Schema validator. After converting the schema with json-schema-to-nickel foo.schema.json > foo.schema.ncl, we can write

let Foo = import "foo.schema.ncl" in
let foo = import "foo.json" in
foo | Foo

We can also apply the contract to values defined in the Nickel code, and this gets us quite a lot. If we just run a schema validator on the generated output, the error messages can only refer to that output, and we have to figure out what Nickel code it’s referring to. A contract, on the other hand, can point to places in the Nickel code directly. The Nickel language server can also use the contract information to tell us the expected types of fields or warn us immediately when something is set wrong, right in the editor.

In practice, many of these things are still being developed, and the functionality works much better with certain kinds of JSON Schemas than others (see the next section). As Nickel, the language server, and json-schema-to-nickel mature, having schemas imported as Nickel contracts will be more and more useful.

Simple example

The following is a simple JSON Schema, defining an object with a required boolean, a required integer, and an optional enum.

{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "title": "MyStruct",
  "type": "object",
  "required": ["my_bool", "my_int"],
  "properties": {
    "my_bool": {
      "type": "boolean"
    },
    "my_int": {
      "type": "integer",
      "format": "int32"
    },
    "my_enum": {
      "type": "string",
      "enum": ["foo", "bar", "baz"]
    }
  }
}

And json-schema-to-nickel produces essentially the following equivalent contract (modulo some preamble).

{
  my_bool | Bool,
  my_enum | std.enum.TagOrString
          | [| 'baz, 'bar, 'foo |]
          | optional,
  my_int | std.number.Integer,
  ..
}

TagOrString allows the user to define the value either in the more natural Nickel way, e.g. 'foo, or as the more JSON-like "foo".

In another Nickel file, we can write

let Schema = import "./schema.ncl" in
{
  my_bool = true,
  my_int = 5,
  my_enum = 'foo,
} | Schema

While writing it, things like tab completion of field names will work. We also get better error reporting.
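Because of TagOrString, the same record should also be accepted when the enum is written as a plain string. The following is a small sketch under the same assumption as above, namely that the generated contract is saved as schema.ncl; the field values are made up for illustration.

```nickel
let Schema = import "./schema.ncl" in
{
  my_bool = false,
  my_int = 7,
  # JSON-style string instead of the enum tag 'bar; TagOrString is what
  # lets both spellings pass the enum contract.
  my_enum = "bar",
} | Schema
```

This form is convenient when the values are copied over from an existing JSON file, while the tag form reads more naturally in hand-written Nickel.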

Consider a slightly more complicated but erroneous test file, where my_bool is set to an integer rather than a bool, but this fact is slightly obscured:

let Schema = import "./schema.ncl" in
let foo = "bool" in
let bar = "int" in
let f = fun x => x + 3 in
{
  "my_%{foo}" = f 2,
  "my_%{bar}" = f 2,
  my_enum = 'foo,
}

When I export this file to JSON and test the result against the schema with check-jsonschema, I get this less-than-useful error:

Schema validation errors were encountered.
  test.json::$.my_bool: 5 is not of type 'boolean'

From a quick glance at our source file we can’t tell where this error originates. Neither my_bool nor 5 actually appears anywhere. The situation could easily be even worse: if foo, bar, and f were defined in another source file, or if the final record were distributed across multiple source files, we wouldn’t know where to begin looking.

If we append | Schema to our Nickel source, we get an error when trying to generate the JSON.

error: contract broken by the value of `my_bool`
    ┌─ /path/to/schema.ncl:817:13
    │
817 │   my_bool | Bool,
    │             ---- expected type
    │
    ┌─ /path/to/test.ncl:6:17
    │
  4 │ let f = fun x => x + 3 in
    │                  ----- evaluated to this expression
  5 │ {
  6 │   "my_%{foo}" = f 2,
    │                 ^^^ applied to this expression
    │
    ┌─ <unknown> (generated by evaluation):1:1
    │
  1 │ 5
    │ - evaluated to this value

Here we see all of the expressions relevant in the code (except perhaps the definition of foo). We can see not only the fully evaluated value, but where in the code this is calculated and assigned.
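For reference, this is what the test file looks like with the contract appended. It is the file from above with | Schema added after the closing brace, so the line numbers match the error message (line 4 for the definition of f, line 6 for the offending field). The trailing comments are added here for explanation only.

```nickel
let Schema = import "./schema.ncl" in
let foo = "bool" in
let bar = "int" in
let f = fun x => x + 3 in
{
  "my_%{foo}" = f 2,  # evaluates to 5, which violates my_bool | Bool
  "my_%{bar}" = f 2,  # evaluates to 5, which is a valid my_int
  my_enum = 'foo,
} | Schema
```

Exporting this file to JSON should then fail with the contract error shown above instead of producing an invalid configuration.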

This simple example schema produced a clean Nickel contract as output. json-schema-to-nickel can handle much more—nested objects, definitions, union types, etc.—but there’s a cost to complexity.

Union contracts are still hard, actually

The main technical / design challenge of json-schema-to-nickel is that JSON Schemas have union types, but as we’ve discussed before on this blog, union contracts are hard, and always come with tradeoffs in a lazy language like Nickel.

In Nickel, terms are only evaluated on demand. If a variable never gets used to produce the output of the program, its value never needs to get evaluated. This can massively speed things up, and means that an error will only be generated if there’s actually a problem that would affect the output. In accordance with this, contracts are (typically) lazy as well. The following example evaluates fine because only a is accessed, so the contract b | String is never forced.

let foo = {
  a = "Hello, ",
  b = 4,
} | {
  a | String,
  b | String,
} in
foo.a

This is convenient in some ways, but requires compromises when trying to define union contracts (where a value has to satisfy one of a number of conditions).

JSON Schemas are meant to operate on static values, so they have no need to be lazy. One simple solution is to match those semantics in Nickel, and strictly apply contracts that come from JSON Schemas, which means strictly evaluating the values they apply to.
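To see what strict evaluation changes, here is the earlier lazy example again with the record forced before a is returned. This is only an illustrative sketch: it uses std.deep_seq from Nickel's standard library, which fully evaluates its first argument and then returns its second, and it is not how json-schema-to-nickel itself applies contracts.

```nickel
let foo = {
  a = "Hello, ",
  b = 4,
} | {
  a | String,
  b | String,
} in
# Forcing the whole record also forces the contract on b, so evaluation
# is expected to stop with a contract error for b even though only a is
# returned.
std.deep_seq foo foo.a
```

With plain foo.a the program succeeds; with the forced version the mistake in b is reported, at the price of evaluating every field.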

In some sense, that’s what we do in json-schema-to-nickel. But in practice, we can do somewhat better. When a JSON Schema can be represented as a lazy contract, using built-in record, enum, and other contracts, we do that, as seen in the simple example above. These are also understood better by the language server, and produce more specific error messages. But as soon as we hit a union, it and all of the contracts it is composed of are evaluated strictly.

For instance, the schema

{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "title": "NullableInt",
  "anyOf": [
    {
      "type": "integer"
    },
    {
      "type": "null"
    }
  ]
}

becomes

let predicates = ... in
predicates.contract_from_predicate
(predicates.anyOf [ predicates.isType 'Integer, predicates.isType 'Null ])

The key observation here is that, for now, this contract is just a function that gets applied to its argument. It’s completely opaque to the LSP. And because the function is strict, if it were large and only part of it were needed (imagine a JSON Schema for nixpkgs, where only one package is being selected in the end) it could slow things down considerably.
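To make the shape of such a predicate-based contract concrete, here is a hand-written sketch of a nullable integer contract. It is not the output of json-schema-to-nickel and does not use its predicates library; it assumes std.contract.from_predicate, std.is_number, and std.number.is_integer from the 1.x standard library (newer releases may recommend a different constructor for custom contracts). The names NullableInt, count, and missing are made up for illustration.

```nickel
let NullableInt =
  std.contract.from_predicate
    (fun value =>
      # Accept null, or a number with no fractional part.
      value == null
      || (std.is_number value && std.number.is_integer value))
in
{
  count | NullableInt = 3,
  missing | NullableInt = null,
}
```

Like the generated version, this contract is just a function from a value to a boolean, so tooling cannot look inside it to learn that the value is "an integer or null".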

This doesn’t necessarily need to be the case. We can refine json-schema-to-nickel to represent more JSON Schemas lazily (in the above example we can check if something is null without fully evaluating it). We will never be able to cover all cases, though; sometimes we will have to resort to strict contracts.

Another approach that could yield some benefits even for strict contracts is to add to Nickel more ways to specify information relevant to the language server. Right now it infers field completions from record contracts, but there could also be an annotation that directly specifies valid field completions. json-schema-to-nickel could generate that information from the JSON Schemas alongside the opaque contracts.

Conclusion

There’s now a tool for converting JSON Schemas to native Nickel contracts. It’s always possible to check with a schema validator after exporting a Nickel configuration to JSON, but json-schema-to-nickel opens the door to more information being available earlier on, while you are writing your configurations.

Feedback

The impetus for this project came from community interest. So please, continue to let us know how you’re using it, and what features would be useful. The issue tracker is a good place for issues, and we hang out on the Nickel Matrix channel for questions and comments.

About the authors
Taeer Bar-Yam

Taeer is a software engineer with a strong background in math, theory, and abstraction. He is comfortable in a range of paradigms, from low-level C networking code to high-level Haskell abstractions. He likes to tinker and solve problems creatively, and he takes great joy in refactoring and finding just the right conceptual model. He cares about effective teamwork and building strong relationships.

If you enjoyed this article, you might be interested in joining the Tweag team.

This article is licensed under a Creative Commons Attribution 4.0 International license.
