Frequently Asked Questions

--- Challenge ---

1.  What is CholecTriplet2021?
An endoscopic vision challenge on the recognition of tool-tissue interactions in surgical videos, in the form of triplets.

2.  What is a triplet?
A combination of {instrument,verb,anatomical target} that describes a surgical action.

3.  Who can participate?
Anyone who signs the challenge agreement except the members of the organizing lab.

4.  Will the challenge submission still be open after the submission deadline?
Likely yes, but only submissions made before the deadline will be eligible for awards.

--- Registration ---

1.  Why is my registration not yet approved?
For your registration to be approved, you must send a signed challenge contract. Check the Getting Started page for more details.

2.  Must every member of my team submit a signed contract?
One contract per team is sufficient. However, every member of the team must abide by the terms and conditions in the signed contract. The dataset obtained after signature of this contract remains confidential and cannot be transferred to someone outside the team.

3.  Is it mandatory to register as a team?

4.  What if I am working alone, must I still register as a team?
Yes. A team can consist of only one person.

5.  Must every member of my team register on the challenge website?

6.  When is the deadline for team registration?
August 30, 2021.

--- CholecT50 dataset ---

1.  What labels in the dataset should be used to train the model?
Triplet labels. The standalone instrument, verb, and target labels are provided as additional labels in case they can help your modeling. Their usage is optional and depends entirely on your proposed method.

2.  I found some action triplet marked null for the verb & target components in the ground truth, is this an error?
No. Some clinically valid triplets are not among the 100 considered triplet classes, due to their low occurrence frequency and limited clinical relevance to the considered procedure.

3.  How many types of null triplets are possible in the dataset?
The possible null triplet classes can be grouped into two as follows:
a.) Instrument-inclusive null: this occurs when there is no instrument in the frame, or when the instrument involved in the action is not one of the valid classes in the dataset. In this case, the label is {null-instrument,null-verb,null-target}. Since triplet recognition is a multi-label classification problem, this class is true only when all other classes are negative in a frame. It is NOT included in the 100 triplet classes.

b.) Non-instrument-inclusive null: this occurs when a valid instrument class is present, but the verb or target involved in the action is from an invalid class, or the triplet combination is not in the considered 100 classes for the reasons in (2) above. In this case, we retain only the instrument presence label while the verb/target are marked null. There are 6 classes from such situations: {grasper,null-verb,null-target}, {bipolar,null-verb,null-target}, {hook,null-verb,null-target}, {scissors,null-verb,null-target}, {clipper,null-verb,null-target}, and {irrigator,null-verb,null-target}.
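As a concrete illustration, here is a minimal sketch of how the two null situations translate into a 100-dimensional multi-label target vector. The class index used for {grasper,null-verb,null-target} is hypothetical; the actual index is defined by the dataset's class mapping.

```python
import numpy as np

NUM_CLASSES = 100
# Hypothetical index of the {grasper,null-verb,null-target} class.
GRASPER_NULL_IDX = 94

def make_label(active_triplet_indices):
    """Build a 100-dim binary target from the triplet class indices present in a frame."""
    y = np.zeros(NUM_CLASSES, dtype=np.float32)
    y[list(active_triplet_indices)] = 1.0
    return y

# a) Instrument-inclusive null: no valid instrument in the frame. The
#    {null-instrument,null-verb,null-target} class is NOT among the 100
#    classes, so the target is simply the all-zero vector.
y_empty = make_label([])

# b) Non-instrument-inclusive null: a grasper is present but the verb/target
#    is invalid, so only the {grasper,null-verb,null-target} class is positive.
y_grasper_null = make_label([GRASPER_NULL_IDX])
```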

4.  Is the train/val/test split different from the one used in the published papers?
Yes. For the challenge, we restrict the test set to videos that are not in the public domain. We recommend a 40/5 train/val split on the provided data, but participants are entirely free to define their own splits.

5.  Is the challenge test dataset publicly available?
No. While the training set consists of 45 videos from the publicly available Cholec80 [1], the test set is a private dataset (not from Cholec80) of the same type of surgery.

6.  I have observed that there are some black images in the dataset, are these images corrupted?
No. As a privacy protection measure, we zeroed out all images that display the faces of clinicians or patients. For temporal consistency, we did not remove the zeroed frames from the dataset.

--- My Challenge Methods ---

1.  Do I need to predict the instrument, verb and target separately?
No. While you may want to leverage the extra annotations provided for instrument, verb, and target to improve your model, you are only required to predict the final triplet IDs as a vector[100] of probability scores.
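For illustration, a minimal sketch of the expected per-frame output shape. The `predict_frame` function and its random logits are stand-ins, not the required architecture; the only point is that each frame yields 100 independent probability scores.

```python
import numpy as np

NUM_TRIPLET_CLASSES = 100

def predict_frame(frame, model=None):
    """Return a vector[100] of probability scores for one frame.
    `model` is a placeholder; any architecture producing per-class logits works."""
    if model is None:
        logits = np.random.randn(NUM_TRIPLET_CLASSES)  # stand-in for real logits
    else:
        logits = model(frame)
    # Triplet recognition is multi-label, so apply an element-wise sigmoid,
    # not a softmax over the 100 classes.
    return 1.0 / (1.0 + np.exp(-logits))

probs = predict_frame(frame=None)
```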

2.  Will the inference pipeline preserve the temporal information?
Yes. Participants are free to train their models on either sequential or shuffled frames. During testing, we will use an input setup that preserves the temporal frame order per video.

3.  What is the frame-rate for the test set?
1 FPS. Same as the train data.

4.  Is the testing going to be an online prediction?
Yes. We will maintain a real-time scenario during testing. This means that our test input setup will collect your model's outputs at time t before feeding the input frame at time t+1. Your method can accumulate and utilize information from previous frames, but not from future ones.
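The online protocol described above can be sketched as the following causal loop. The `predict` callable and the history buffer are illustrative assumptions, not the actual test harness:

```python
def run_online_inference(frames, predict):
    """Causal test loop: the output for frame t is collected before frame t+1
    is revealed, so `predict` may only look at past frames."""
    history = []   # accumulated past-frame information (model-defined)
    outputs = []
    for frame in frames:
        scores = predict(frame, history)  # must not peek at future frames
        outputs.append(scores)
        history.append(frame)             # this frame becomes "past" for t+1
    return outputs

# Toy usage: a predictor that just reports how many past frames it has seen.
outs = run_online_inference(range(5), lambda frame, history: len(history))
```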

5.  My model performance is quite low, do I still need to submit?
The triplet recognition task is generally challenging: the average performance of a random model is 0.01%. So if you beat this performance, you have a good method to submit for the competition.

--- Baseline Methods ---

1.  Where can I find a published paper/article on surgical triplet recognition?
[Nwoye C.I., Recognition of Instrument-Tissue Interactions in Endoscopic Videos via Action Triplets, MICCAI 2020]
Please note that the model in this paper [2] is trained and evaluated on CholecT40 (a subset of CholecT50).

The journal version [3] of the baseline [2] is trained on CholecT50:
[Nwoye C.I., Rendezvous: Attention Mechanisms for the Recognition of Surgical Action Triplets in Endoscopic Videos]

2.  Where can I find a trained model (code) on triplet recognition?
The code is not public at the moment, but we provide sample code in the Colab code blog to help you get started.
Note that we do not provide weights for any sample/published model.

3.  Must I follow the same strategy as in the published papers?
No, you are free to develop any method that works for you: deep learning, machine learning, rule-based inference, etc.

4.  Can I submit exactly the same model as in the published papers?
Submitting an original and novel method is highly recommended; however, you are not constrained in what you can submit.

--- Training ---

1.  Is pretraining on a surgical dataset allowed?
Yes, you are free to pretrain your model on any third-party public dataset.

--- Submission ---

1.  How do I submit my method?
Methods are to be submitted as a Docker image. We will provide a Docker template and submission guidelines by late July 2021.
The submission channel will open on Aug 10, 2021.

--- Evaluation ---

1.  How do you evaluate a model prediction on the null triplets?
In this challenge, every null triplet class is excluded from the performance evaluation. However, your submission must still comprise the complete 100-class predictions. If you excluded the null classes during training, you can fill those values with 0s.
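If your model only scores the non-null classes, a minimal sketch of padding the submission vector follows. The positions of the 6 null classes are hypothetical here; use the dataset's actual class mapping.

```python
import numpy as np

NUM_CLASSES = 100
# Hypothetical positions of the 6 {instrument,null-verb,null-target} classes.
NULL_CLASS_INDICES = [94, 95, 96, 97, 98, 99]

def pad_with_null_zeros(non_null_scores):
    """Place the 94 trained scores into a full 100-class vector,
    filling the excluded null-class slots with 0s."""
    full = np.zeros(NUM_CLASSES, dtype=np.float32)
    keep = [i for i in range(NUM_CLASSES) if i not in NULL_CLASS_INDICES]
    full[keep] = non_null_scores
    return full

full = pad_with_null_zeros(np.ones(94, dtype=np.float32))
```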

--- Publication ---

1.  Will my challenge submission be published?
We plan a joint publication on surgical action triplet recognition that will include the submitted challenge models and results. More information will be provided in due course.

2.  Who will be co-authors?
Every participating team can include at most 2 qualifying authors. The sub-challenge organizers determine the order of the authors in the joint challenge paper.

3.  When can a participant publish an independent research on this dataset?
Participants are allowed to publish their own results separately only after the publication of the joint challenge paper.

4.  When will the joint results be published?
This should be expected before the end of 2022.

--- References ---

[1] Twinanda, A. P., Shehata, S., Mutter, D., Marescaux, J., De Mathelin, M., & Padoy, N. (2016). EndoNet: a deep architecture for recognition tasks on laparoscopic videos. IEEE Transactions on Medical Imaging, 36(1), 86-97.

[2] Nwoye, C. I., Gonzalez, C., Yu, T., Mascagni, P., Mutter, D., Marescaux, J., & Padoy, N. (2020, October). Recognition of instrument-tissue interactions in endoscopic videos via action triplets. In International Conference on Medical Image Computing and Computer-Assisted Intervention (pp. 364-374). Springer, Cham.

[3] Nwoye, C. I., Yu, T., Gonzalez, C., Seeliger, B., Mascagni, P., Mutter, D., Marescaux, J., & Padoy, N. (2021, September). Rendezvous: attention mechanisms for the recognition of surgical action triplets in endoscopic videos. arXiv preprint arXiv:2109.03223.