Friday, April 26, 2019

50/E04, 8.30h-10h

Sophia, Maren, Luana, Malin, Thomas, Katharina, Josefine, Richard, Michael, Ulf

Excursion Today

Organisational Matters

  • meeting in Rheine at around 15h to visit the asparagus sorting machine

  • we can go there by car (3) or train

What do we want to do?

  • collect more data (take hard drive with us)

  • try to install Richard’s tool

  • talk to Sophia’s brother again to explain

  • maybe try to get some labelled data

  • we still need the list of categories (which could be acquired once we’re there)

Project Management Tools

Presentation by Luana:

  • Trello

  • ClickUp

  • Asana

  • Google Drive

Vote for tool: Asana (no strong opinion)

GitHub

  • also has a project management system

  • we would not need another user account

  • university server can be used for file/image storage

  • folder name “Asparagus”

  • path: net/projects/scratch/summer/valid_until_31_January_2020/asparagus

  • when creating a folder, give access to other group members

New vote: GitHub

  • if it is not to our liking after two weeks we will use Asana

GitHub Project “Asparagus”

  • we will create other projects for later (smaller) tasks

Paper Research and Library

Presentation by Katharina and Josefine:

  • introduction of a library system for paper research

    • add short info (title, author, year, helpfulness etc) about any paper you read

    • the info categories are still up for debate

  • short summary of 5 papers

    • Donis-González and Guyer (2016) (asparagus sections)

    • Diaz et al. (2004) (table olives)

    • Pedreschi, Mery, and Marique (2016) (potatoes)

    • Kılıç et al. (2007) (beans)

    • Mery, Pedreschi, and Soto (2013) (general framework)

What papers might be interesting to look for?

  • image acquisition

    • illumination?

  • small subset of labelled data

    • keyword: semi-supervised learning

    • autoencoders

Other papers found (which could be added to the library?)

  • flower classification (Luana)

  • non-labelled images (Maren)

  • feature extraction (Malin)

Discussion: Usefulness of having a collection of papers

  • keeps it organized to avoid double reading

  • able to look up key words/papers for later problems

  • have a guideline

  • should not lose ourselves in too many details here

Open discussion

Image acquisition

  • little less than two months left for image acquisition

  • might be most important part right now – illumination should be fine (no further work on that needed)

3 remarks by Ulf

  • looking at literature is always good

  • give pros and cons for your rating of a paper

  • now we should focus on data acquisition (not going too deep into methods right now)

  • for future: should definitely have a look at working with unbalanced data sets, because most other papers work with balanced data sets
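
As a starting point for the remark on unbalanced data sets, here is a minimal sketch of one common approach: weighting each class inversely to its frequency so that rare categories count more during training. The labels "A", "B" and "C" are hypothetical placeholders (our category list does not exist yet), and this weighting scheme is only one option, not something decided in the meeting.

```python
from collections import Counter


def inverse_frequency_weights(labels):
    """Return one weight per class, inversely proportional to its frequency."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    # weight = n_samples / (n_classes * count)
    # A perfectly balanced data set gives every class the weight 1.0.
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in counts.items()}


# Hypothetical, unbalanced label list: 6 x "A", 3 x "B", 1 x "C"
labels = ["A"] * 6 + ["B"] * 3 + ["C"] * 1
weights = inverse_frequency_weights(labels)
```

The rarest class ("C") gets the largest weight. Such weights could later be passed to a loss function or used for oversampling, depending on the method we choose.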

Vote for library system of papers

  • keep it (8)

  • not useful enough (1)

Information about the program on the classification machine

  • can store features of images, date etc.

  • on screen it seems it did well with borders etc.

  • the question is: why does the machine perform so badly if the features look so good?

    • if the features are bad we need new ones

    • but if the features are good it would make the task easier because we could use the machine’s features

    • the most difficult feature seemed to be the colouring

  • for future: if we have labelled data, it would be good to have a folder for each category

  • naming conventions (to avoid double naming/overcomplicated names)
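
The idea of one folder per category could be sketched as follows. This is only an illustration under assumptions: the file names (img_0001.png etc.), the category names (category_a, category_b) and the function name are hypothetical, since neither the category list nor the naming convention has been fixed yet. The demo works on throwaway files in a temporary directory.

```python
import shutil
import tempfile
from pathlib import Path


def sort_into_category_folders(labels, source_dir, target_dir):
    """Copy each labelled image into target_dir/<category>/ and return the new paths."""
    source_dir, target_dir = Path(source_dir), Path(target_dir)
    copied = []
    for filename, category in labels.items():
        category_dir = target_dir / category
        category_dir.mkdir(parents=True, exist_ok=True)  # create folder on first use
        destination = category_dir / filename
        shutil.copy2(source_dir / filename, destination)  # copy, keep the original
        copied.append(destination)
    return copied


# Demo with placeholder files (hypothetical names and categories)
with tempfile.TemporaryDirectory() as tmp:
    raw = Path(tmp) / "raw"
    raw.mkdir()
    for name in ("img_0001.png", "img_0002.png", "img_0003.png"):
        (raw / name).write_bytes(b"placeholder")
    labels = {
        "img_0001.png": "category_a",
        "img_0002.png": "category_b",
        "img_0003.png": "category_a",
    }
    out = Path(tmp) / "sorted"
    copied = sort_into_category_folders(labels, raw, out)
    sorted_files = sorted(p.relative_to(out).as_posix() for p in copied)
```

Copying instead of moving keeps the raw data untouched, which seems safer while the categories are still under discussion.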

Preparations & next session

To Do:

  • update about “playing” with pictures (Thomas)

    • cut image into 3 parts to have one piece of asparagus per picture

    • filter out background (find a bounding box around asparagus), i.e. have distinct bright background (to have no confusion with dirt/purple asparagus/dark, rusty stains)

    • work on preprocessing steps

    • reducing file size (right now it is ~5 GB)

    • goal: have images that all look the same (right now there are 1-3 asparagus per picture) and are small enough

  • have a look at Grid again and maybe also have a small GitHub “tutorial” meeting (Katharina)

    • you can have a look at the code of the service on GitHub

    • meeting after regular Friday morning session (03.05.19, 10h)

  • start implementation (Malin, Richard, Maren)

    • reading up on implementing pictures/preprocessing papers

    • use algorithms learned in lectures

  • try to figure out how the machine program works to understand its feature selection (Sophia)

    • which features are extracted?

  • think about a guideline for the (manual) picture classification

    • create a cheat sheet with sorting instructions (step-by-step) (Josefine)

      • best would be same categories as the machine uses

    • look into having code for sorting pictures into different categories (to make the work easier) (Michael)

  • look at unbalanced data sets (data set augmentation) and maybe semi-supervised learning (Luana)
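
For the preprocessing steps listed under Thomas's update (cut the image into 3 parts, find a bounding box against a bright background), a rough sketch with NumPy might look like this. Everything here is an assumption for illustration: that the three pieces lie side by side so the image can be cut into equal-width vertical strips, that the image is grayscale, and that a fixed threshold of 200 separates asparagus from background. The real images may need a different cut direction and a more robust segmentation.

```python
import numpy as np


def split_into_strips(image, n_strips=3):
    """Split an image array (height x width) into vertical strips of equal width."""
    return np.array_split(image, n_strips, axis=1)


def bounding_box(gray, threshold=200):
    """Return (top, bottom, left, right) around all pixels darker than threshold.

    bottom and right are exclusive, so gray[top:bottom, left:right] is the crop.
    Returns None if no pixel is darker than the threshold.
    """
    mask = gray < threshold  # True where the (darker) object is
    rows = np.flatnonzero(mask.any(axis=1))
    cols = np.flatnonzero(mask.any(axis=0))
    if rows.size == 0:
        return None
    return int(rows[0]), int(rows[-1]) + 1, int(cols[0]), int(cols[-1]) + 1


# Synthetic test image: bright background (255) with three dark rectangles
image = np.full((10, 30), 255, dtype=np.uint8)
image[2:8, 4:6] = 90
image[1:9, 14:17] = 90
image[3:7, 25:27] = 90

strips = split_into_strips(image)
boxes = [bounding_box(strip) for strip in strips]
crops = [strip[t:b, l:r] for strip, (t, b, l, r) in zip(strips, boxes)]
```

Cropping to the bounding box also removes most of the background, which should help with the file size, although by how much depends on the real images.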

Topics for next week:

  • GitHub: branching, naming conventions