🎉 initial commit

2021-06-09 16:55:29 +02:00 · 2021-06-09 16:55:29 +02:00 · 50b368874c
commit 50b368874c
parent 3625a0f9c2
2 changed files with 225 additions and 1 deletions
--- a/README.md
+++ b/README.md
@ -1,2 +1,78 @@
-# tensorboard_image_extractor
+# Tensorboard Image Extractor
 ## What is this program and why?
 This is short script to extract images and create animated gif from there
 using a tensorboard event file. Unfortunatly this feature is not native inside
 tensorboard, inside which only the graphing data can downloaded (in `csv` or
 `json` format).
 The only other program which I found that did a similar thing is
 https://github.com/lanpa/tensorboard-dumper/ which I took inspiration from.
 ## How to use it
 The repository can be clone with git and the you will maybe need to install
 some dependencies (like tensorboard):
 ```
 pip3 install -r requirements.txt
 ```
 You can then run it:
 ```
 python3 tensorboard_image_extractor.py -i event.db
 ```
 You can get some help by running:
 ```
 python3 tensorboard_image_extractor.py --help 
 ```
 ## Tensorboard datastructure
 The following diagram describes a tree of the log directory found in all
 machine learning experiment with a tensorboard writer.
 ```
 logs/
 ├── lupo
 │   ├── args.txt
 │   ├── config.txt
 │   ├── model_005000.pth
 │   ├── model_010000.pth
 │   ├── model_015000.pth
 │   ├── model_020000.pth
 │   └── model_025000.pth
 └── summaries
    └── lupo
        └── events.out.tfevents.1623155921.pop-os
 ```
 The file which contains all data and images is the `event` file in
 `logs/summaries/{run name}/events`. It can be fairly large because every image
 is stored inside in binary format.
 ## Example
 You can create an animated gif, only keeping images with a certain `tag`:
 ```
 python3 tensorboard_image_extractor.py -i lupo.events -t "train/level_1/rgb" -o train_level_1_rgb_24h.gif --gif
 ```
 ## Performance
 In order to create a gif from a 900 MB event file, it took me just over an
 hour. This is due to the fact that Python has to do the I/O reading from binary
 data and converting the whole file, which is remarkably slow.
 It can create large gif files. In the experiment described above the images of
 a single tag was kept and it created a 52 MB gif file.
 ## Notes
 This program is distributed under GNU GNL v3 or later License, which you can
 find a copy of in the repository.
 This program comes with ABSOLUTELY NO WARRANTY
 Tensorboard Image Extractor - Copyright (C) 2021 - Otthorn
--- a/tensorboard_image_extractor.py
+++ b/tensorboard_image_extractor.py
@ -0,0 +1,148 @@
 # Tensorboard Image Extractor Copyright (C) 2021 Otthorn
 # License: GNU GPL v3 or later
 import argparse
 import io
 import tensorboard.compat.proto.event_pb2 as event_pb2
 from PIL import Image
 from tqdm import tqdm
 def read_event(data):
    """
    Read one event from the datastream.
    Returns the event as a string and the trucated data without the event that
    was read.
    """
    h0 = int.from_bytes(data[:8], "little")
    event_str = data[12 : 12 + h0]
    data = data[12 + h0 + 4 :]
    return data, event_str
 def read_file(input_path):
    """
    Read a file.
    Read a file and return the data, throws an error and exits if no file is
    found.
    """
    try:
        with open(input_path, "rb") as f:
            data = f.read()
        return data
    except FileNotFoundError:
        print(f"Input file {input_path} is not a valid path.")
        exit()
 def decode_image(img):
    """Decodes an image"""
    d_img = Image.open(io.BytesIO(img.encoded_image_string))
    return d_img
 def main(args):
    data = read_file(args.input)
    original_length = len(data)
    pbar = tqdm(total=original_length)
    img_list = []
    while data:
        data, event_str = read_event(data)
        pbar.n = original_length - len(data)
        pbar.update(0)
        event = event_pb2.Event()
        event.ParseFromString(event_str)
        if event.HasField("summary"):
            for value in event.summary.value:
                if value.HasField("image"):
                    tag = value.ListFields()[0][1]
                    # if args.Nons is None process everything, else process
                    # only the given tag
                    if args.tag is None or args.tag == tag:
                        img = value.image
                        img_d = decode_image(img)
                        # sanitize tag
                        tag = tag.replace("/","_")
                        tag = tag.replace(" ","_")
                        if args.gif:
                            # save an image list for the gif
                            img_list.append(img_d)
                        else:
                            print(f"Saving as: img_{tag}_{event.step}.png")
                            img_d.save(f"img_{tag}_{event.step}.png", format="png")
    if args.gif:
        # save as an animated gif
        print("[DEBUG] saving animated gif")
        im = img_list[0]
        im.save(
            args.output,
            save_all=True,
            append_images=img_list,
            duration=args.second_per_frame,
            loop=args.do_not_loop,
        )
 if __name__ == "__main__":
    parser = argparse.ArgumentParser(
        description="Tensorboard image dumper and gif creator"
    )
    parser.add_argument(
        "--input",
        "-i",
        type=str,
        help="Input file, must be a tensorboard event file",
        required=True,
    )
    parser.add_argument(
        "--output",
        "-o",
        type=str,
        help="Output file for the gif, must have a .gif extension",
    )
    parser.add_argument(
        "--gif",
        default=False,
        action="store_true",
        help="Save the ouptut as an animated gif",
    )
    parser.add_argument(
        "--do-not-loop",
        default=True,
        action="store_false",
        help="Prevent the gif from looping",
    )
    parser.add_argument(
        "--second-per-frame",
        "-spf",
        type=int,
        default=60,
        help="Time between each frame (in milisecond)",
    )
    parser.add_argument(
        "--tag",
        "-t",
        type=str,
        help="Select a single tag for the ouptut",
    )
    args = parser.parse_args()
    main(args)