To simplify data management, we keep the "master" versions of the metadata for VEP sub-corpora in
a GitHub repository. This gives us a single definitive version and lets us track
changes to it. The master spreadsheet purposefully omits information that would be redundant
with other sources (for example, the information stored in the TCP master data file).
However, when you need a spreadsheet, you might not need all of the columns, or all of the rows,
of the master file. You might also want information that is held in other spreadsheets (like the
TCP master file, or a Ubiquity run of the data set).
This program takes information from the master data files stored on GitHub (the master metadata
sheet and a recent Ubiquity run) as well as the TCP master metadata sheet (stored in the TCP GitHub
repository) and "joins" them together, allowing you to build a single table. You can choose which
columns you want from each of the three data sets. You can also filter which rows you want.
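As a rough illustration of what "joining" means here, the following Python sketch uses pandas to combine three small tables, keep only chosen columns, and filter rows. It is not the program's own code: the shared key column `id`, the other column names, the sample values, and the `build_table` helper are all assumptions made up for the example.

```python
import pandas as pd

# Hypothetical miniature stand-ins for the three sources. The real column
# names and the shared key ("id" here) are assumptions for illustration.
meta = pd.DataFrame({
    "id": ["A001", "A002", "A003"],
    "genre": ["play", "sermon", "play"],
    "notes": ["", "check date", ""],
})
ubiq = pd.DataFrame({
    "id": ["A001", "A003"],
    "word_count": [18250, 9400],
})
tcp = pd.DataFrame({
    "id": ["A001", "A002", "A003"],
    "date": [1601, 1623, 1640],
    "author": ["Author One", "Author Two", "Author Three"],
})


def build_table(meta, ubiq, tcp, meta_cols, ubiq_cols, tcp_cols, row_filter=None):
    """Join the three sources on the shared key, keeping only chosen columns."""
    key = "id"
    # Start from the metadata rows, then attach the selected columns
    # from the other two sources where a matching key exists.
    table = meta[[key] + meta_cols]
    table = table.merge(ubiq[[key] + ubiq_cols], on=key, how="left")
    table = table.merge(tcp[[key] + tcp_cols], on=key, how="left")
    # Optionally keep only the rows for which the filter returns True.
    if row_filter is not None:
        table = table[row_filter(table)]
    return table.reset_index(drop=True)


table = build_table(
    meta, ubiq, tcp,
    meta_cols=["genre"],
    ubiq_cols=["word_count"],
    tcp_cols=["date"],
    row_filter=lambda t: t["genre"] == "play",
)
print(table)
```

Here the result has one row per "play" entry in the metadata, with only the `genre`, `word_count`, and `date` columns attached to the key.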
A few quirks of the interface:
- The TCP master data is not loaded by default - you must press the button to load it. It can take a long time.
- After you change settings, you need to press "Rebuild Table"; the table is not rebuilt
automatically after each change.
- You can sort and search the table. When the data is written out, the output reflects the table as currently sorted and searched.
- The table is meant as a preview and uses a standard table viewing component (DataTables.net).
You will probably want to write it out and look at it in a program better suited to viewing a big table.
- The "use ubiq rows" option means that the spreadsheet will have one row per entry in the Ubiquity
data (even if that entry has no corresponding row in the metadata file). Normally, there is a row for
every metadata entry, even if there is no Ubiquity row.
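The difference between the default behavior and the "use ubiq rows" option can be thought of as a left join versus a right join. The sketch below shows that idea in Python with pandas; the `id` key, the column names, the sample values, and the `join_meta_ubiq` helper are hypothetical, and the program itself may implement the option differently.

```python
import pandas as pd

# Hypothetical sample data: "A001" has no Ubiquity row, and "A999"
# appears in the Ubiquity run but not in the metadata sheet.
meta = pd.DataFrame({"id": ["A001", "A002"], "genre": ["play", "sermon"]})
ubiq = pd.DataFrame({"id": ["A002", "A999"], "word_count": [5100, 7300]})


def join_meta_ubiq(meta, ubiq, use_ubiq_rows=False):
    """Return one row per metadata entry (default) or one per Ubiquity entry."""
    # A left join keeps every metadata row; a right join keeps every Ubiquity row.
    how = "right" if use_ubiq_rows else "left"
    return meta.merge(ubiq, on="id", how=how)


default_rows = join_meta_ubiq(meta, ubiq)
ubiq_rows = join_meta_ubiq(meta, ubiq, use_ubiq_rows=True)
print(default_rows)
print(ubiq_rows)
```

In the default table, "A001" is kept with an empty `word_count`; with the option on, "A999" is kept with an empty `genre` and "A001" is dropped.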
When you have constructed a table that you are satisfied with, you can save it as a CSV file
for further analysis by pressing the "CSV" button.