choochoo

Training Diary

Repairing FIT Files

Choochoo includes a tool that will attempt to repair FIT format files. It does this by trying to discard data until it finds a valid file (this works because a FIT file is mainly a sequence of repeated records, one for each GPS point; if we remove a corrupt record we still have nearly all the data).

For Dummies

The FIT format is complex so it is difficult to make a simple, automatic tool, but if you do not want to read this documentation the following might work:

> ch2 fix-fit INPUT.FIT --add-header --drop --max-drop-cnt 2 -o OUTPUT.FIT

To run that you first need to install Choochoo.

Overview

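The program runs in a series of steps, each described in its own section below (Add Header, Drop Data, Slices, Timestamps, Header and Checksums, Validation and Output).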

Modifications made to the data should be logged at the “warning” level in the logs.

Not Quite So Dummy

Given the above, we can unpack the earlier recipe:

> ch2 fix-fit INPUT.FIT --add-header --drop --max-drop-cnt 2 -o OUTPUT.FIT

The --add-header means that a new header is added to the beginning of the data. Since the data may already contain a header this will introduce a parsing error (unless you are very unlucky and the old header looks like valid data). Using --drop will hopefully then discard the old header.

The likely result of the recipe above, then, is to replace the old header with a new one. If we had not used --add-header the old header would have been “fixed” (see Header and Checksums), but that would not have corrected any issues if a byte (or more) of data was actually missing from the header (since fixing assumes that the data have the correct length).

If the original header was OK, we did a bunch of work for no reason and the problem is elsewhere in the file. This is why --max-drop-cnt 2 is given - it allows --drop to drop a further region of data (making two drops in total) and hopefully fix the issue.

So in total, the recipe allows for two fixes: one to the header and another somewhere in the data.

Finally the file length and checksums are fixed and the data saved to OUTPUT.FIT.

Add Header

The --header-size, --protocol-version and --profile-version flags can be used to fine-tune the header. Default values are taken from FR35 FIT files. To see the defaults run the program without specifying the option and read the logs (--discard is useful here - see Output). Values for the current (old) header are also logged.

Note that adding the header increases the data size. So if used with --slices you must take this into account (and add --header-size to your indices).

For example, the following command tests replacing an existing 14 byte header:

> ch2 fix-fit INPUT.FIT --add-header --header-size 14 --slices :14,28: --discard
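The index arithmetic behind that command can be sketched in Python. This is only an illustration using toy byte strings (the names old_header, records and new_header are made up here, and real files contain FIT data rather than repeated letters); it is not the code that ch2 runs.

```python
# Toy stand-ins for the parts of a FIT file.
old_header = b"O" * 14   # existing (possibly damaged) 14 byte header
records = b"R" * 100     # the record data that follows it
checksum = b"CC"         # trailing 2 byte checksum

original = old_header + records + checksum

# --add-header --header-size 14 puts a new header in front of everything,
# so every index into the original data moves up by 14.
new_header = b"N" * 14
data = new_header + original

# --slices :14,28: keeps the new header (0-14) and skips the old one (14-28).
# The final open slice stops at -2, which excludes the old checksum.
kept = data[:14] + data[28:-2]

assert kept == new_header + records
```

The old header originally occupied indices 0 to 14, but after the new header is added it sits at 14 to 28, which is why the second slice starts at 28.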

Drop Data

This is the heart of the algorithm. A depth-first search is made to delete corrupt data, allowing the remaining data to be parsed. The details of this search can be inferred from the parameters below, but if you really, really want to understand you will need to read the source.

The following parameters influence the search:

The slices determined by the search are printed to the log like this:

INFO: Found slices :14,28:28034,28057:97451
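To give a feel for what such a search involves, here is a toy depth-first search in Python. It is not Choochoo's implementation: the record format (a 0xAA marker, a length byte, then the payload), the functions parse_from and search, and the max_len parameter are all invented for this sketch. Only the idea of limiting the number of drops corresponds to --max-drop-cnt.

```python
def rec(payload):
    """Build one toy record: marker 0xAA, a length byte, then the payload."""
    return bytes([0xAA, len(payload)]) + payload

def parse_from(data, pos):
    """Parse toy records from pos; return the offset where parsing stops.

    A return value of len(data) means everything from pos onwards parsed.
    """
    while pos < len(data):
        if data[pos] != 0xAA or pos + 1 >= len(data):
            return pos
        end = pos + 2 + data[pos + 1]
        if end > len(data):
            return pos
        pos = end
    return pos

def search(data, pos=0, drops_left=2, max_len=16):
    """Return the (lo, hi) slices to keep from pos onwards, or None on failure."""
    bad = parse_from(data, pos)
    if bad == len(data):
        return [(pos, bad)]
    if drops_left == 0:
        return None
    for length in range(1, max_len + 1):
        resume = bad + length
        if resume >= len(data):
            break
        if parse_from(data, resume) == resume:
            continue  # nothing parses at this offset, try a longer drop
        # Depth first: commit to this drop and try to parse the rest.
        rest = search(data, resume, drops_left - 1, max_len)
        if rest is not None:
            kept = [(pos, bad)] if bad > pos else []
            return kept + rest
    return None

# Two junk bytes sit between the first and second records.
data = rec(b"abc") + b"\x00\x01" + rec(b"hello") + rec(b"x")
slices = search(data)
print(",".join(f"{lo}:{hi}" for lo, hi in slices))
```

In this toy the search only ever drops bytes starting at the point where parsing failed. Real corruption can be detected some distance after the damaged bytes, so a practical search needs more freedom than this.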

Slices

The slices are a comma-separated list of lo:hi pairs, where either limit can be omitted. The syntax is intended to match the Python array syntax, so if the data are in the array data, then lo:hi identifies the data data[lo:hi].

The existing checksum should not be included in the slices. To help with this a final, open slice has the “stop” value replaced with -2. So : (all data) would be changed to :-2 (all data but the last two bytes - the checksum).
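The slice syntax and the checksum rule can be illustrated with a short Python sketch. The functions parse_slices and apply_slices are hypothetical helpers written for this page, not part of Choochoo, and they assume the input is well formed (one colon per slice).

```python
def parse_slices(spec):
    """Parse 'lo:hi,lo:hi,...' into (lo, hi) tuples; None marks an omitted limit."""
    slices = []
    for part in spec.split(","):
        lo, hi = part.split(":")
        slices.append((int(lo) if lo else None, int(hi) if hi else None))
    return slices

def apply_slices(data, slices):
    """Concatenate data[lo:hi] for each slice, leaving out the trailing checksum."""
    slices = list(slices)
    lo, hi = slices[-1]
    if hi is None:
        # A final open slice stops two bytes early to exclude the checksum.
        slices[-1] = (lo, -2)
    return b"".join(data[lo:hi] for lo, hi in slices)

# The slices from the example log line in Drop Data.
print(parse_slices(":14,28:28034,28057:97451"))

# On 40 bytes of dummy data, ':' keeps everything except the last two bytes.
data = bytes(range(40))
assert apply_slices(data, parse_slices(":")) == data[:-2]
assert apply_slices(data, parse_slices(":14,28:")) == data[:14] + data[28:-2]
```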

Most slices will result in data that cannot be parsed, and so will fail validation (see Validation). This is why it is best to use the slices found by --drop (see Drop Data).

Timestamps

If --start is given, a shift in time is calculated by comparing this with the first timestamp in the file. This offset is then applied to all timestamps.

Rewriting timestamps will require that the checksum is updated, so you should also use --fix-checksum (see below).
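The offset calculation might look like the following Python sketch. It assumes that timestamps are plain integers counting seconds since the FIT epoch (1989-12-31 00:00:00 UTC, from the FIT specification), and shift_timestamps is a hypothetical helper, not a function from Choochoo. Real files may also contain compressed timestamps, which this ignores.

```python
from datetime import datetime, timedelta, timezone

# FIT timestamps count seconds from this instant.
FIT_EPOCH = datetime(1989, 12, 31, tzinfo=timezone.utc)

def shift_timestamps(timestamps, start):
    """Shift raw FIT timestamps so that the first one equals start."""
    if not timestamps:
        return []
    target = int((start - FIT_EPOCH).total_seconds())
    offset = target - timestamps[0]
    # The same offset is applied everywhere, so intervals are preserved.
    return [t + offset for t in timestamps]

# A device with a bad clock recorded times shortly after the epoch.
raw = [1000, 1001, 1005]
start = datetime(2018, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
shifted = shift_timestamps(raw, start)

assert FIT_EPOCH + timedelta(seconds=shifted[0]) == start
assert shifted[2] - shifted[0] == raw[2] - raw[0]
```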

Header and Checksums

Once data have been modified the file header (for --fix-header) and checksum (for --fix-checksum) are updated appropriately. If necessary, new values for --protocol-version and --profile-version can be specified on the command line.

(These two operations are done “together” because one affects the other; in practice they are repeated as necessary until consistent.)
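As a rough illustration of what these fixes involve, the Python sketch below builds a 14 byte header and appends the file checksum. The header layout and the CRC (a 16 bit CRC with reflected polynomial 0xA001 and initial value 0) follow the FIT specification; the function names and the version numbers in the example are placeholders invented here. Because the sketch knows the data size up front it works in a single pass, unlike the repeated approach described above.

```python
import struct

def fit_crc(data, crc=0):
    """16 bit CRC used by FIT files, computed bit by bit."""
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ 0xA001 if crc & 1 else crc >> 1
    return crc

def build_header(data_size, protocol_version, profile_version):
    """Build a 14 byte header: 12 bytes of fields plus a CRC of those bytes."""
    fields = struct.pack(
        "<BBHI4s",          # little-endian, no padding
        14,                 # header size
        protocol_version,
        profile_version,
        data_size,          # length of the records, excluding header and checksum
        b".FIT",
    )
    return fields + struct.pack("<H", fit_crc(fields))

def finalise(records, protocol_version, profile_version):
    """Wrap record data in a fresh header and append the file checksum."""
    body = build_header(len(records), protocol_version, profile_version) + records
    return body + struct.pack("<H", fit_crc(body))

# Placeholder version numbers; dummy bytes stand in for real records.
fit = finalise(b"\x00" * 32, protocol_version=0x10, profile_version=100)

# A CRC taken over a block plus its own (little-endian) CRC is zero.
assert fit_crc(fit) == 0
assert fit_crc(fit[:14]) == 0
```

The file checksum covers the header as well as the records, so any change to the header (for example a new data size) also changes the checksum.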

Validation

The corrected data are parsed once more to validate that the changes are correct. This can be skipped with --no-validate.

Output

By default, the results are printed to stdout in “hex” format (or raw binary if --raw is given). They can be saved to a file (in raw binary format) using -o or --output. Alternatively, the output can be discarded using --discard (this is useful when fine-tuning parameters).

Alternatively, if --name-bad or --name-good is given, then the names (only) of bad (or good, respectively) files are printed to stdout. Typically this is used with -v 0 to suppress logging.

Further Reading