The first time I used XSD was in 2001, I think, for a format we were developing to do human rights violation reporting. In one part of the document would be a list of people committing violations, in another part a list of people witnessing violations, and then in another part the violations themselves. The violations part would have attributes saying which people had taken part in the violation, and who had witnessed it. These attributes were a comma separated list of ids. This structure is of course easy to represent with XPath, probably via Schematron. But there is no real way to represent this kind of context dependent structure in the first version of XSD (I have not kept up, for reasons that shall become clear). Although I have to admit that Henry Thompson is a great programmer, and one of the most worthwhile speakers on technical issues I have ever had the pleasure to hear, and while his model of XSD validation as a finite state machine is also elegant, it still does not make it suck any less, because the standard could not validate many common markup structures.
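That kind of cross-reference rule is easy to state in Schematron precisely because assertions are just XPath evaluated in context. A minimal sketch, with invented element and attribute names (`violation/@perpetrators` holding the comma-separated ids, declared people at `//people/person/@id`) and a modern XPath 2 query binding rather than anything that existed in 2001:

```xml
<sch:schema xmlns:sch="http://purl.oclc.org/dsdl/schematron"
            queryBinding="xslt2">
  <sch:pattern>
    <!-- Every id in the comma-separated perpetrator list must
         refer to a person declared elsewhere in the document. -->
    <sch:rule context="violation">
      <sch:assert test="every $id in tokenize(@perpetrators, ',\s*')
                        satisfies $id = //people/person/@id">
        A violation references an undeclared perpetrator.
      </sch:assert>
    </sch:rule>
  </sch:pattern>
</sch:schema>
```

A grammar-based schema language like XSD 1.0 has no place to hang a constraint like this, which is exactly the context dependence described above.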
2. These XML books tended to have a section on XML and well-formedness, namespaces, UTF-8, and examples of designing a format - generally a book or address format - and all this stuff probably came to approximately 80-115 pages. Which was what you needed to understand the basis of XML.

3. Then would come the secondary stuff to understand, XPath and XSLT - a query language and a programming DSL for manipulating the data/document format. I would say this would be another 100-150 pages.

4. Then validation and entities in DTDs, noting that this was old stuff from SGML days, that you didn't need it, and that there was going to be some other way to validate really soon. (and then when that validation language came it sucked, as I noted elsewhere)

5. Then, because tech books need to be thick and a 300 page book is not big enough, a bunch of stuff that never amounted to anything, like XLink, or some breathless stuff about some XML formats, maybe a talk about SVG and VML, XSL-FO, blah blah blah. Another 60 pages?

> If you put only a portion of it in source control, how do you back up the rest of your database?

I think this is simply a misunderstanding of how these DB change management tools work and what they can do to help you with complex migrations over an application's lifetime. Your "mostly static" data is managed by running inserts/updates/deletes when the data changes (either manually, or the tool can usually generate them for you). When you actually apply a migration, the software also records that it has been applied in a database change log table. That way, when you want to update your database with the latest migrations, it'll only run what hasn't been applied yet. That allows your standard backup/restore procedures to work just fine.

> If you want a golden copy to bootstrap new environments, I would argue you are better off backing up that golden copy and restoring it using native database tools

So this is essentially what we are doing with liquibase, using database dumps without any data as our baseline. Any DDL changes are managed through migration scripts. There are a number of things that are not managed by migration scripts in our project, though, and are instead apply-on-change. We found it better to have our views stored as individual files in liquibase and have them apply on change, because of nested dependencies and other issues with not having a good source of truth for the view definition. Functions/procedures were another thing best treated as code rather than migrations. It also allows you to deal with conflicts between branches with your standard merge tools. Our "static" data that would only change when there is an application change is managed through csv files in liquibase that apply on any change. That data needs to be in sync with the version of the application deployed, so it makes sense to manage it along with the code.
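For concreteness, a trimmed sketch of what the apply-on-change parts can look like in a Liquibase XML changelog (ids and file names here are made up). `runOnChange="true"` makes Liquibase re-run a changeSet whenever its contents change instead of treating it as already applied, and `loadUpdateData` upserts rows from a csv by primary key:

```xml
<databaseChangeLog xmlns="http://www.liquibase.org/xml/ns/dbchangelog">

  <!-- View definition lives in its own SQL file (the source of truth)
       and is re-applied whenever that file changes; the file itself
       should use CREATE OR REPLACE VIEW so re-runs are idempotent. -->
  <changeSet id="active-users-view" author="team" runOnChange="true">
    <sqlFile path="views/active_users.sql"
             relativeToChangelogFile="true"
             splitStatements="false"/>
  </changeSet>

  <!-- "Static" reference data: inserts or updates rows to match the csv. -->
  <changeSet id="country-data" author="team" runOnChange="true">
    <loadUpdateData tableName="country"
                    file="data/country.csv"
                    primaryKey="id"
                    relativeToChangelogFile="true"/>
  </changeSet>

</databaseChangeLog>
```

Ordinary DDL changeSets omit `runOnChange` and run exactly once, which Liquibase tracks in its DATABASECHANGELOG table.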
Ndjson is actually a really pragmatic choice here that should not be overlooked. Tabular formats break down when the data stops being tabular. People love spreadsheets as editing tools, but they then end up doing things like putting comma separated values in a cell. I've also seen business people use empty cells to indicate hierarchical "inheritance". An alternate interpretation of that is that the data has some kind of hierarchy and isn't really row based. People just shoehorn all sorts of stuff into spreadsheets because they are there.

With ndjson, every line is a json object. Json has actual types (int, float, string, boolean). If you need multiple values, you can use arrays for the fields. So you can have both hierarchical and multivalued data in a row.
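A short sketch of what that buys you - three hypothetical records, one json object per line, with typed scalars, an array where a spreadsheet would have a comma-separated cell, and a nested object for the hierarchical part:

```
{"id": 1, "name": "Acme", "active": true, "tags": ["retail", "eu"], "address": {"city": "Berlin", "zip": "10115"}}
{"id": 2, "name": "Globex", "active": false, "tags": ["wholesale"], "address": {"city": "Lyon", "zip": "69001"}}
{"id": 3, "name": "Initech", "active": true, "tags": [], "address": null}
```

Each line parses on its own, so the file still streams and diffs line by line the way a csv does, while the cell-level hacks become real arrays and nested objects.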