diff-gather-stats
Generate a configurable report or a summary of annotation results. Each summary is saved as a single JSON file. Various subcommands compute different types of statistics.
To create annotation results, run the diff-annotate command.
Usage:
$ diff-gather-stats [OPTIONS] COMMAND [ARGS]...
Options:
--annotations-dir DIR_NAME
: Subdirectory to read annotations from; use '' for no subdirectory [default: annotation]
--help
: Show this message and exit.
Commands:
purpose-counter
: Calculate count of purposes from all bugs...
purpose-per-file
: Calculate per-file count of purposes from...
lines-stats
: Calculate per-bug and per-file count of...
timeline
: Calculate timeline of bugs with per-bug...
list-added-lines
: List added lines from all bugs in provided...
diff-gather-stats purpose-counter
Count purposes across all bugs in the provided datasets.
Each dataset is expected to be an existing directory with the following structure:
<dataset_directory>/<bug_directory>/annotation/<patch_file>.json
A dataset can contain many bugs; each bug should include its annotated patch as a *diff.json file in the 'annotation/' subdirectory.
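The directory layout above can be traversed with a few lines of Python. The following is only a sketch for illustration, not the tool's own code: the function name `iter_annotation_files` is hypothetical, and the handling of an empty subdirectory name is an assumption based on the description of `--annotations-dir`.

```python
from pathlib import Path
from typing import Iterator, Tuple


def iter_annotation_files(
    dataset_dir: Path, annotations_dir: str = "annotation"
) -> Iterator[Tuple[str, Path]]:
    """Yield (bug_name, json_path) for every annotation file in one dataset.

    Hypothetical helper; it mirrors the documented layout
    <dataset_directory>/<bug_directory>/annotation/<patch_file>.json
    """
    for bug_dir in sorted(p for p in dataset_dir.iterdir() if p.is_dir()):
        # Assumption: an empty name means the JSON files sit directly
        # in the bug directory, with no intermediate subdirectory.
        source = bug_dir / annotations_dir if annotations_dir else bug_dir
        if not source.is_dir():
            continue  # bug without annotations: skip it
        for json_path in sorted(source.glob("*.json")):
            yield bug_dir.name, json_path
```

Calling `list(iter_annotation_files(Path("my_dataset")))` on a directory laid out as described would return one pair per annotated patch, which is a quick way to check that a dataset has the expected shape before gathering statistics.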
Usage:
$ diff-gather-stats purpose-counter [OPTIONS] DATASETS...
Arguments:
DATASETS...
: [required]
Options:
-o, --output JSON_FILE
: JSON file to write gathered results to
--help
: Show this message and exit.
diff-gather-stats purpose-per-file
Count purposes per file across all bugs in the provided datasets.
Each dataset is expected to be an existing directory with the following structure:
<dataset_directory>/<bug_directory>/annotation/<patch_file>.json
A dataset can contain many bugs; each bug should include its annotated patch as a *diff.json file in the 'annotation/' subdirectory.
Usage:
$ diff-gather-stats purpose-per-file [OPTIONS] RESULT_JSON DATASETS...
Arguments:
RESULT_JSON
: JSON file to write gathered results to [required]
DATASETS...
: list of dirs with datasets to process [required]
Options:
--help
: Show this message and exit.
diff-gather-stats lines-stats
Count line types per bug and per file in the provided datasets.
Each dataset is expected to be an existing directory with the following structure:
<dataset_directory>/<bug_directory>/annotation/<patch_file>.json
A dataset can contain many bugs; each bug should include its annotated patch as a *diff.json file in the 'annotation/' subdirectory.
Usage:
$ diff-gather-stats lines-stats [OPTIONS] OUTPUT_FILE DATASETS...
Arguments:
OUTPUT_FILE
: JSON file to write gathered results to [required]
DATASETS...
: list of dirs with datasets to process [required]
Options:
--purpose-to-annotation PURPOSE:LINE_TYPE|PURPOSE
: Mapping from file PURPOSE to line type LINE_TYPE. Each line of a file with that purpose is treated as if it had the given line type. As a shortcut, giving PURPOSE alone is the same as PURPOSE:PURPOSE. Can be given multiple times.
--help
: Show this message and exit.
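The PURPOSE:LINE_TYPE|PURPOSE syntax of --purpose-to-annotation can be illustrated with a small parser. This is a sketch of the documented behaviour only, not the tool's implementation; the function name `parse_purpose_to_annotation` and the sample values are hypothetical.

```python
from typing import Dict, Iterable


def parse_purpose_to_annotation(values: Iterable[str]) -> Dict[str, str]:
    """Turn repeated option values into a {purpose: line_type} mapping.

    Hypothetical helper: 'PURPOSE:LINE_TYPE' maps PURPOSE to LINE_TYPE,
    and a bare 'PURPOSE' is shorthand for 'PURPOSE:PURPOSE'.
    """
    mapping: Dict[str, str] = {}
    for value in values:
        purpose, separator, line_type = value.partition(":")
        # Without a colon, the purpose doubles as the line type.
        mapping[purpose] = line_type if separator else purpose
    return mapping


# Illustrative values, as if the option had been given twice.
print(parse_purpose_to_annotation(["data", "documentation:doc"]))
```

Because the option can be given multiple times, the sketch accepts any iterable of values and lets later entries override earlier ones for the same purpose; whether the real command does the same is not stated in this page.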
diff-gather-stats timeline
Calculate a timeline of bugs with per-bug counts of different types of lines.
For each bug (bugfix commit), compute the count of lines removed and added by the patch (commit) in all changed files, keeping separate counts for lines with different types, and (separately) with different purposes.
The gathered data is then saved in a format that is easy to load into a dataframe.
Each DATASET is expected to be generated by annotating a dataset or creating annotations from a repository, and should be an existing directory with the following structure:
<dataset_directory>/<bug_directory>/annotation/<patch_file>.json
Each dataset can consist of many bugs; each bug should include a JSON file with its diff/patch annotations as a *.json file in the 'annotation/' subdirectory (by default).
Saves gathered timeline results to the OUTPUT_FILE.
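The per-bug counting described above can be sketched as follows. The input layout (`{file_name: {"-": [...], "+": [...]}}` with a `"type"` key per line) and the output key names are assumptions made for this illustration; they are not taken from this page and may differ from what the command actually reads and writes.

```python
from collections import Counter
from typing import Any, Dict


def count_line_types(patch_annotations: Dict[str, Any]) -> Dict[str, int]:
    """Count removed ('-') and added ('+') lines per line type for one patch.

    Hypothetical helper working on an assumed layout:
    {file_name: {"-": [{"type": ...}, ...], "+": [{"type": ...}, ...]}}
    """
    counts: Counter = Counter()
    for file_data in patch_annotations.values():
        for sign in ("-", "+"):
            for line in file_data.get(sign, []):
                counts[f"{sign}:count"] += 1  # total per direction
                counts[f"{sign}:type.{line['type']}"] += 1  # per line type
    return dict(counts)


# Made-up annotation data for a single bug touching two files.
example = {
    "a.py": {"-": [{"type": "code"}],
             "+": [{"type": "code"}, {"type": "documentation"}]},
    "README.md": {"+": [{"type": "documentation"}]},
}
print(count_line_types(example))
```

Producing one flat dictionary per bug is a deliberate choice in this sketch: a list of such records is the kind of shape that dataframe libraries can load directly, with one row per bug and one column per counter.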
Usage:
$ diff-gather-stats timeline [OPTIONS] OUTPUT_FILE DATASETS...
Arguments:
OUTPUT_FILE
: file to write gathered results to [required]
DATASETS...
: list of dirs with datasets to process [required]
Options:
--purpose-to-annotation PURPOSE:LINE_TYPE|PURPOSE
: Mapping from file PURPOSE to line type LINE_TYPE. Each line of a file with that purpose is treated as if it had the given line type. As a shortcut, giving PURPOSE alone is the same as PURPOSE:PURPOSE. Can be given multiple times.
--help
: Show this message and exit.
diff-gather-stats list-added-lines
List added lines from all bugs in the provided datasets.
Each dataset is expected to be an existing directory with the following structure:
<dataset_directory>/<bug_directory>/annotation/<patch_file>.json
A dataset can contain many bugs; each bug should include its annotated patch as a *diff.json file in the 'annotation/' subdirectory.
Usage:
$ diff-gather-stats list-added-lines [OPTIONS] DATASETS...
Arguments:
DATASETS...
: [required]
Options:
--help
: Show this message and exit.