PhD thesis proposal (CIFRE LaTIM - AQUILAB)
Harmonization methods for radiomics
Mathieu Hatt and David Gibon
We are looking for a motivated student to conduct
a PhD under the CIFRE convention between the Laboratory of Medical Information
Processing (LaTIM, INSERM UMR 1101) and the company AQUILAB.
The PhD candidate will be recruited by AQUILAB
and will work full time for 3 years within the LaTIM. Both videoconference
meetings and in-person stays at the AQUILAB offices will be
scheduled regularly. The candidate must hold a master's and/or an engineering
degree in computer science, machine learning or statistics. Additional
experience/expertise with medical imaging and clinical applications will be
considered a plus but is not a prerequisite. The candidate must be proficient in
English (both written and spoken).
A CV and a letter of motivation should be sent
to both hatt@univ-brest.fr and david.gibon@aquilab.com
Cancer is a major worldwide
health issue, with 13 million cases and 8 million deaths reported in 2008,
and projections of 22 million cases and 13 million deaths for 2030 [1]. Nowadays, multimodal medical imaging such as computed
tomography (CT), positron emission tomography (PET/CT) and magnetic resonance
tomography (CT), positron emission tomography (PET/CT) and magnetic resonance
imaging (MRI) is crucial in oncology, with numerous applications including
early diagnosis, staging, treatment decision and planning, monitoring, and
patient follow-up [2]. A
patient’s treatment and follow-up strategy could be optimized based on improved
diagnosis (“virtual biopsy”) and predictive models able to identify
patients at risk of future local failures and recurrence at diagnosis, before
initiating treatment. In addition, the need to integrate data from several
sources (clinical, imaging, dosimetry, genetics, toxicity) in order to improve
predictive ability has been emphasized [3].
Radiomics denotes the high
throughput extraction of numerous quantitative metrics (including shape,
intensity or textural features) from images, with the goal of providing a full
macroscopic phenotyping of tissues (tumors, organs, etc.) that could reflect at
least in part the underlying pathophysiological processes (such as necrosis,
proliferation, etc.), down to the genomic level [4]. The field has seen exponential growth over
the last 5 years (fewer than 10 publications in 2014, ~450 in 2018, and already
as many in 2019). This is mostly due to its potential to provide a
quantitative signature of tumor characteristics that cannot necessarily be
appreciated with the naked eye, even a trained one [5]. With
the quick and overwhelming development of deep learning in all fields of
science including medical imaging [6],
radiomics is evolving rapidly, with techniques based on deep neural networks
being used either to automate or improve parts of the radiomics workflow,
or to replace it entirely [7–9]. Recent
attempts to exploit convolutional neural networks (CNNs) for radiomics and predictive modeling in oncology and
radiotherapy [10–12] were
met with limited success, given the numerous algorithmic and study design
challenges that still need to be addressed in this context. A particular issue
concerns the limited amount of available data for training in medical imaging,
compared to other applications (e.g.,
the millions of images used in ImageNet [13]).
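To make the notion of feature extraction concrete, the sketch below computes a handful of first-order radiomic features over a segmented region. The data, function name and bin count are hypothetical choices for illustration; this is a minimal sketch only, not an IBSI-compliant implementation (which would, for instance, control intensity discretization explicitly).

```python
import numpy as np

def first_order_features(image, mask):
    """A few first-order radiomic features over a segmented region.
    Illustrative sketch only; real studies should use an IBSI-compliant
    implementation with controlled intensity discretization."""
    voxels = image[mask > 0].astype(float)
    mean, std = voxels.mean(), voxels.std()
    hist, _ = np.histogram(voxels, bins=32)
    p = hist[hist > 0] / voxels.size  # non-empty bin probabilities
    return {
        "mean": mean,
        "std": std,
        "skewness": ((voxels - mean) ** 3).mean() / std ** 3,
        "entropy": -(p * np.log2(p)).sum(),
    }

# Toy example: a synthetic 3D "image" and a cubic "tumor" mask.
rng = np.random.default_rng(0)
img = rng.normal(100, 15, size=(32, 32, 32))
msk = np.zeros_like(img, dtype=bool)
msk[8:24, 8:24, 8:24] = True
feats = first_order_features(img, msk)
```

A real radiomics workflow extracts dozens to hundreds of such features (including shape and textural ones) per region, which is precisely why standardized definitions and implementations matter.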
Radiomics has shown promising results in
identifying tumor subtypes, aggressiveness as well as in predicting response to
therapy and outcome of patients in several cancers [9]; however,
most of these results have been obtained in small, retrospective, monocentric cohorts. Importantly, recently published
recommendations and guidelines [9,14–16]
regarding the use of automated segmentation, standardized features, and proper
machine learning schemes for statistical analysis have not been followed in the
majority of these studies, except the most recent ones. As a result, even well-known studies in
radiomics have been criticized as potential false discoveries [17], or even as erroneous interpretations, for example
a radiomics signature acting as a surrogate of tumor volume [18].
In the following, we will therefore use the
term “radiomics” to denote any methodological workflow (relying on deep
learning or not) aiming at extracting clinically relevant information from
images (whatever the modality or scale).
Radiomics is currently
implemented in the research community as a rather complex sequential workflow,
which suffers from several limitations that hamper its potential transfer to
the clinical routine, including i) the lack of standardization, ii) the lack of
automation, iii) the lack of harmonization and iv) the “black box” effect [9,14].
Methods relying on deep neural networks could
help solve most of these limitations (e.g.,
fully automatic detection and segmentation of tumors in the images, instead of
the semi-automatic approaches followed in most studies) [8,9].
Figure 1 below illustrates the radiomics
workflow as well as how it can be implemented following the standard machine
learning or the more recent deep learning pipelines.
Figure 1: the radiomics workflow and its
implementation following machine or deep learning pipelines.
On the
one hand, standardization was identified early on as a major limitation
preventing radiomics from entering clinical practice, because of the lack of
comparability of results. No meta-analysis could be carried out, because each research
group relied on different methodological workflows, software, nomenclature and implementation
choices, and did not provide sufficient details for their work to be reproduced
[14]. These issues have been addressed over the last two years by the Image
Biomarker Standardization Initiative (IBSI) [15,19] and the radiomics ontology.
On the
other hand, it has been shown for PET [20–22], CT [23,24] and MRI
[25,26] that
most radiomic features exhibit moderate to high sensitivity to variability in
scanner models, acquisition protocols and reconstruction settings, which
constitutes the biggest challenge for multicentric studies [27]. The lack of harmonization across scanner models,
reconstruction algorithms and acquisition protocols leads to high inter-site
variability in images and in the resulting features. This is the current
clinical reality, and it is unlikely to change. This is why currently proposed
radiomics and/or deep learning models remain limited in terms of validation on
external datasets [30].
Our long-term goal is to achieve societal
impact by improving patient management. This will be achieved thanks to more
robust and accurate predictive models that will help identify patients at risk
before initiating treatment. In order for these tools to be exploited in the
clinical routine, a high level of proof is necessary, which in turn requires
larger-scale, multicentric (ideally prospective) studies on the use of
radiomics and/or deep learning techniques in patient management relying on
multimodal medical images; such studies are currently lacking. The objectives of this PhD are thus to develop
harmonization techniques in both image and feature domains in order to improve,
facilitate or even render feasible otherwise impossible radiomic analyses of large,
multicentric, heterogeneous cohorts in all types of multimodal imaging and
cancer applications.
Standardization can be
defined as a concept whereby agreement of results is achieved by establishing
traceability to higher-order reference materials and/or measurement
procedures. Harmonization is defined as the process of reaching agreement in
order to produce a consistent interpretation where no reference measurement
procedure exists.
In the following, we use the term standardization to denote a process to achieve common
and standard practice, nomenclature, mathematical definitions and
implementation in the overall methodological workflow of radiomics. It may
also denote the similar process in achieving comparable acquisition protocols
and reconstruction settings in the generation of medical multimodal images,
such as computed tomography (CT), positron emission tomography (PET) or
magnetic resonance imaging (MRI). In the
present PhD we will rely on existing guidelines and standards, such as the Image
Biomarker Standardization Initiative (IBSI) [15,19], in order to
ensure the highest level of standardization of our developments and to increase
the likelihood of the reproducibility of our results.
On the other hand, we will use the term harmonization to denote the process by which we
make multimodal medical images and/or features extracted from these images, comparable
and suitable for pooling, irrespective of where and how they were produced.
Three
main approaches can be considered: i) harmonizing images before feature
extraction, ii) harmonizing the extracted features, and iii) a combination of
i) and ii); transfer learning will also be investigated.
In order to harmonize images of a given
modality but from different clinical centers (acquired and reconstructed using
different scanner models/generations, acquisition protocols and/or
reconstruction algorithms and settings), we will rely on generative adversarial
networks (GANs) to transform images so they are made more similar to each other
while preserving their respective informative content, which will be the main
challenge.
We will
develop a GAN-based framework in which multicentric, heterogeneous images are
translated to match the properties of a standard dataset, such as a template
reference image, or alternatively, one set of images chosen as a reference (in
the absence of an appropriate standard). The first challenge is to determine
the relevant properties within images (local or global metrics, texture, edges,
contrast, signal-to-noise ratio, etc.) that should be reproduced. The second
challenge is to ensure the ability of the framework to harmonize images without
losing their clinically-relevant informative content, which will be one of the
most important criteria for its evaluation. Although numerous studies have demonstrated the use
of GANs to synthesize images, such as generating a CT from an MRI for MR-based
radiotherapy treatment planning, these techniques have not yet been
extensively exploited for the purpose of multicentric image harmonization. A
very recent example concerns the reduction of variance due to the use of
different kernels in CT reconstruction by relying on a CNN [31].
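As a much simpler, non-deep point of comparison for this image-domain idea, quantile (histogram) matching maps one center's intensity distribution onto a reference distribution. The GAN framework aims to generalize such transforms while preserving the spatial, clinically relevant content that a purely global mapping ignores. The arrays below are hypothetical synthetic stand-ins for real images.

```python
import numpy as np

def match_histogram(source, reference):
    """Map source intensities onto the reference distribution via
    quantile matching. A crude global baseline: unlike a learned,
    spatially aware transform, it is blind to image structure."""
    s_vals, s_idx, s_counts = np.unique(
        source.ravel(), return_inverse=True, return_counts=True)
    r_vals, r_counts = np.unique(reference.ravel(), return_counts=True)
    s_quantiles = np.cumsum(s_counts) / source.size
    r_quantiles = np.cumsum(r_counts) / reference.size
    # For each source quantile, look up the matching reference intensity.
    mapped = np.interp(s_quantiles, r_quantiles, r_vals)
    return mapped[s_idx].reshape(source.shape)

rng = np.random.default_rng(0)
src = rng.normal(50, 5, (64, 64))    # "center A" image intensities
ref = rng.normal(100, 20, (64, 64))  # reference intensities
out = match_histogram(src, ref)      # out now follows ref's distribution
```

After matching, the global intensity statistics of `out` agree with the reference, but nothing constrains local contrast or texture, which is exactly the informative content the GAN-based framework must preserve.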
In the features (image-derived variables)
space, numerous statistical approaches can be applied, such as normalization [32] or batch-effect compensation [33]. In radiomics, the ComBat method, initially
developed for genomics batch correction through Bayesian estimates [33], has been used to carry out
multicentric studies [34]. It was chosen because it had previously been shown to
outperform other similar statistical approaches [35]. Figure 2 below illustrates the benefit of
ComBat for multicentric validation of radiomics models.
Figure 2: Kaplan-Meier curves for loco-regional
control in locally advanced cervical cancer obtained using the FDG PET + ADC
map radiomics model in the testing multicentric cohort with and without
features harmonization using the standard ComBat approach [34].
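For intuition, the sketch below implements a simplified, non-Bayesian location-scale adjustment in the spirit of ComBat: each batch (center) is shifted and rescaled to the pooled per-feature statistics. The actual ComBat method additionally shrinks per-batch estimates with empirical Bayes and can preserve known biological covariates; the data and function name here are hypothetical.

```python
import numpy as np

def location_scale_harmonize(X, batch):
    """Align each batch's per-feature mean/std to the pooled values.
    A stripped-down stand-in for ComBat (no empirical Bayes shrinkage,
    no covariate preservation)."""
    X = np.asarray(X, dtype=float)
    batch = np.asarray(batch)
    out = np.empty_like(X)
    grand_mean = X.mean(axis=0)
    grand_std = X.std(axis=0)
    for b in np.unique(batch):
        rows = batch == b
        mu, sd = X[rows].mean(axis=0), X[rows].std(axis=0)
        sd = np.where(sd == 0, 1.0, sd)  # guard constant features
        out[rows] = (X[rows] - mu) / sd * grand_std + grand_mean
    return out

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (30, 5)),   # "center A" features
               rng.normal(5, 2, (30, 5))])  # "center B": shifted/scaled
batch = np.array(["A"] * 30 + ["B"] * 30)
Xh = location_scale_harmonize(X, batch)     # batch means/stds now agree
```

The sketch also makes ComBat's limitations tangible: each batch needs enough samples for stable `mu`/`sd` estimates, all features are moved to the pooled reference, and adding a new center changes `grand_mean`/`grand_std`, forcing a full re-run.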
However, ComBat suffers from a number of
limitations: a minimal number of annotated samples per batch is required; the
batch-corrected variables may lose their absolute values, and hence their clinical
meaning, because all features are moved to an arbitrary average reference; and
the harmonization is cumbersome to apply to newly acquired patients or when an
additional center is added to the database (the harmonization over the
entire database has to be re-run each time). We will improve these methods' robustness (e.g., adding Monte Carlo
estimation for small samples) and flexibility (the ability to choose a
reference amongst the available batches), as well as their ability to learn
transforms so they can be applied to newly added data. The combination of
batch-correction methods with unsupervised clustering will also be investigated
to deal with data presenting very high heterogeneity and a very small number of
samples per batch.
It might be beneficial and
complementary to combine image-based and feature-based harmonization methodologies to improve the results of
multicentric radiomics studies. This will involve evaluating the potential
added benefit of each of the previously developed approaches.
The goal of this task will be to determine whether the first or the second
approach (or the combination of both) is the most efficient, taking into account not only the absolute
improvement observed in the results, but also the computing time and effort
required to implement each approach in practice. This will be crucial,
especially to facilitate the transfer of the developed methods to the clinical
practice through industrial implementation.
Multiparametric models trained using standard
machine learning methods or deep neural networks cannot be directly applied to
external datasets with different properties. This task will investigate the use of transfer learning to address this
issue, i.e., fine-tuning pre-trained
models (whether based on deep networks or on “shallower” modeling
approaches such as random forests or support vector machines) so that they
perform better on new, unseen data with important differences in images and/or
data.
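The warm-start idea behind fine-tuning can be sketched as follows, using a shallow logistic model and hypothetical synthetic cohorts rather than a deep network: a model pre-trained on a large "center A" dataset performs poorly on a small, distribution-shifted "center B" dataset, and continuing training from the pre-trained weights adapts it with little new data.

```python
import numpy as np

def train_logreg(X, y, w=None, epochs=200, lr=0.1):
    """Batch-gradient logistic regression. Passing existing weights `w`
    warm-starts the optimization -- the fine-tuning idea, applied here
    to a shallow linear model for illustration."""
    Xb = np.hstack([X, np.ones((len(X), 1))])  # append bias column
    if w is None:
        w = np.zeros(Xb.shape[1])
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-Xb @ w))      # sigmoid probabilities
        w = w - lr * Xb.T @ (p - y) / len(y)   # gradient step
    return w

def accuracy(w, X, y):
    Xb = np.hstack([X, np.ones((len(X), 1))])
    return float(((Xb @ w > 0).astype(int) == y).mean())

rng = np.random.default_rng(0)
# Hypothetical cohorts: large "center A", small shifted "center B".
X_a = rng.normal(0, 1, (500, 10)); y_a = (X_a[:, 0] > 0).astype(int)
X_b = rng.normal(2, 1, (40, 10));  y_b = (X_b[:, 0] > 2).astype(int)

w = train_logreg(X_a, y_a)                    # pre-train on center A
acc_before = accuracy(w, X_b, y_b)            # poor: distribution shift
w = train_logreg(X_b, y_b, w=w, epochs=1000)  # fine-tune on center B
acc_after = accuracy(w, X_b, y_b)             # adapted to the new center
```

The same principle carries over to deep networks (freezing early layers, retraining the head) and, with adaptations, to tree ensembles and kernel methods.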
The PhD
supervisor is Mathieu Hatt from the team ACTION (therapeutic action guided by
multimodal imaging in oncology) led by Dimitris Visvikis, in the Laboratory of
medical information processing, LaTIM (INSERM UMR 1101). M. Hatt is in charge
of the group “multiparametric modeling for therapy optimization” within the
team ACTION. His main field of expertise is the development of image processing
and analysis methods, especially dedicated to positron emission tomography
(PET), such as automatic segmentation, partial volume effects correction or
filtering. On the topic of automatic PET image segmentation, the team
contributed to the report of the international task group 211 of the AAPM
(American Association of Physicists in Medicine) [36,37] and organized
the first MICCAI challenge [38]. During the
last three years, the group has also developed its expertise around machine
(deep) learning methods and extended its expertise to CT and MR imaging.
Regarding radiomics, the team is amongst the pioneers with a first publication
in 2011 [39]. Since then,
the group has published more than 30 papers related to radiomics in PET, CT and
MRI, from methodological developments to more clinical applied studies. The
reviews, editorials and invited perspectives [7–9,14,27] also indicate
the level of recognition of the team on the topic.
Most of the
proposed developments will require extensive computing power in order to
process datasets, train and validate models (especially deep learning ones).
The LaTIM is managing a high performance computing platform (PLACIS, http://placis.univ-brest.fr/english)
which is a hybrid cluster with 800 CPU cores and 50 GPUs dedicated to
calculations, with a 150 TB storage facility. Access to this platform will be
granted to the PhD student.
Aquilab is a
French company created in 2000, based on a technological transfer of
methodological developments by research and clinical teams in Lille. The
company has been led by David Gibon since its creation. David Gibon has a
background in computer science and dedicated 10 years of his career to
research on radiotherapy and the exploitation of medical imaging in therapeutic
action before founding Aquilab in 2000.
Aquilab
developed software solutions for quality control of medical imaging and
radiotherapy hardware. Its ARTISCAN solution is installed in more than 350
centers worldwide, and the company is the leader in France, equipping more than
80% of oncology centers. The company has also developed the ARTIVIEW solution
for preparing and evaluating radiotherapy treatment plans. In recent years, this
expertise has been combined with a web platform (Share Place) for managing
databases in multicentric imaging and radiotherapy trials.
The team ACTION has already established
a collaborative research effort with Aquilab. First, some methods and code (PET
image segmentation, radiomics) are undergoing industrial transfer into the software
solutions of Aquilab through the SATT Ouest Valorisation. Second, a Labcom
(MALICE, machine learning against cancer) associating ACTION and Aquilab is
currently under submission for funding to the ANR. Finally, ACTION and Aquilab
are also partners in a large project on data sharing and analysis in pediatric
cancer under submission to INCa.
References
1. Bray F,
Ferlay J, Soerjomataram I, Siegel RL, Torre LA, Jemal A. Global cancer
statistics 2018: GLOBOCAN estimates of incidence and mortality worldwide for 36
cancers in 185 countries. CA Cancer J Clin. 2018;68:394–424.
2. Zafra M, Ayala F, Gonzalez-Billalabeitia E,
Vicente E, Gonzalez-Cabezas P, Garcia T, et al. Impact of
whole-body 18F-FDG PET on diagnostic and therapeutic management of Medical
Oncology patients. Eur J Cancer. 2008;44:1678–83.
3. Jaffray DA, Das S, Jacobs PM, Jeraj R, Lambin P. How Advances in
Imaging Will Affect Precision Radiation Oncology. Int J Radiat Oncol Biol Phys.
2018;101:292–8.
4. Segal E, Sirlin CB, Ooi C, Adler AS, Gollub J, Chen X, et al.
Decoding global gene expression programs in liver cancer by noninvasive
imaging. Nat Biotechnol. 2007;25:675–80.
5. Aerts H. Radiomics: there is more than meets the eye in medical
imaging. SPIE Medical Imaging 2016: Computer-Aided Diagnosis. 2016.
p. 97850O. Available from: http://dx.doi.org/10.1117/12.2214251
6. Litjens G, Kooi T, Bejnordi BE, Setio AAA, Ciompi F, Ghafoorian M,
et al. A survey on deep learning in medical image analysis. Med Image Anal.
2017;42:60–88.
7. Visvikis D, Cheze Le Rest C, Jaouen V, Hatt M. Artificial
intelligence, machine (deep) learning and radio(geno)mics: definitions and
nuclear medicine imaging applications. Eur J Nucl Med Mol Imaging. 2019;
8. Hatt M, Parmar C, Qi J, Naqa IE. Machine (Deep) Learning Methods
for Image Processing and Radiomics. IEEE Trans Radiat Plasma Med Sci.
2019;3:104–8.
9. Hatt M, Le Rest CC, Tixier F, Badic B, Schick U, Visvikis D.
Radiomics: Data Are Also Images. J Nucl Med Off Publ Soc Nucl Med.
2019;60:38S-44S.
10. Antropova N, Huynh BQ, Giger ML. A deep
feature fusion methodology for breast cancer diagnosis demonstrated on three
imaging modality datasets. Med Phys. 2017;44:5162–71.
11. Bibault J-E, Giraud P, Housset M, Durdux C, Taieb J, Berger A, et
al. Deep Learning and Radiomics predict complete response after neo-adjuvant
chemoradiation for locally advanced rectal cancer. Sci Rep. 2018;8:12611.
12. Diamant A, Chatterjee A, Vallières M, Shenouda G, Seuntjens J.
Deep learning in head & neck cancer outcome prediction. Sci Rep.
2019;9:2764.
13. Russakovsky O, Deng J, Su H, Krause J, Satheesh S, Ma S, et al.
ImageNet Large Scale Visual Recognition Challenge. Int J Comput Vis.
2015;115:211–52.
14. Vallières M, Zwanenburg A, Badic B, Cheze Le Rest C, Visvikis D,
Hatt M. Responsible Radiomics Research for Faster Clinical Translation. J Nucl
Med Off Publ Soc Nucl Med. 2018;59:189–93.
15. Zwanenburg A, Leger S, Vallières M, Löck S, for the Image Biomarker
Standardisation Initiative. Image biomarker standardisation initiative.
arXiv:1612.07003 [cs]. 2016. Available from: http://arxiv.org/abs/1612.07003
16. Zwanenburg A. Radiomics in nuclear medicine: robustness,
reproducibility, standardization, and how to avoid data analysis traps and
replication crisis. Eur J Nucl Med Mol Imaging. 2019;
17. Chalkidou A, O’Doherty MJ, Marsden PK. False Discovery Rates in
PET and CT Studies with Texture Features: A Systematic Review. PloS One.
2015;10:e0124165.
18. Welch ML, McIntosh C, Haibe-Kains B, Milosevic MF, Wee L, Dekker
A, et al. Vulnerabilities of radiomic signature development: The need for
safeguards. Radiother Oncol J Eur Soc Ther Radiol Oncol. 2019;130:2–9.
19. Hatt M, Vallieres M, Visvikis D, Zwanenburg A. IBSI: an
international community radiomics standardization initiative. J Nucl Med.
2018;59:287–287.
20. Galavis PE, Hollensen C, Jallow N, Paliwal B, Jeraj R. Variability
of textural features in FDG PET images due to different acquisition modes and
reconstruction parameters. Acta Oncol. 2010;49:1012–6.
21. Yan J, Chu-Shern JL, Loi HY, Khor LK, Sinha AK, Quek ST, et al.
Impact of Image Reconstruction Settings on Texture Features in 18F-FDG PET. J
Nucl Med. 2015;56:1667–73.
22. Pfaehler E, Beukinga RJ, de Jong JR, Slart RHJA, Slump CH, Dierckx
RAJO, et al. Repeatability of 18 F-FDG PET radiomic features: A phantom study
to explore sensitivity to image reconstruction settings, noise, and delineation
method. Med Phys. 2019;46:665–78.
23. Mackin D, Fave X, Zhang L, Fried D, Yang J, Taylor B, et al.
Measuring Computed Tomography Scanner Variability of Radiomics Features. Invest
Radiol. 2015;50:757–65.
24. Berenguer R, Pastor-Juan MDR, Canales-Vázquez J, Castro-García M,
Villas MV, Mansilla Legorburo F, et al. Radiomics of CT Features May Be
Nonreproducible and Redundant: Influence of CT Acquisition Parameters.
Radiology. 2018;288:407–15.
25. Yang F, Dogan N, Stoyanova R, Ford JC. Evaluation of radiomic
texture feature error due to MRI acquisition and reconstruction: A simulation
study utilizing ground truth. Phys Medica PM Int J Devoted Appl Phys Med Biol
Off J Ital Assoc Biomed Phys AIFB. 2018;50:26–36.
26. Um H, Tixier F, Bermudez D, Deasy JO, Young RJ, Veeraraghavan H.
Impact of image preprocessing on the scanner dependence of multi-parametric MRI
radiomic features and covariate shift in multi-institutional glioblastoma
datasets. Phys Med Biol. 2019;64:165011.
27. Hatt M, Lucia F, Schick U, Visvikis D. Multicentric validation of
radiomics findings: challenges and opportunities. EBioMedicine. 2019;
28. Fiset S, Welch ML, Weiss J, Pintilie M, Conway JL, Milosevic M, et
al. Repeatability and reproducibility of MRI-based radiomic features in
cervical cancer. Radiother Oncol J Eur Soc Ther Radiol Oncol. 2019;135:107–14.
29. Traverso A, Kazmierski M, Shi Z, Kalendralis P, Welch M, Nissen
HD, et al. Stability of radiomic features of apparent diffusion coefficient
(ADC) maps for locally advanced rectal cancer in response to image
pre-processing. Phys Medica PM Int J Devoted Appl Phys Med Biol Off J Ital
Assoc Biomed Phys AIFB. 2019;61:44–51.
30. Zwanenburg A, Löck S. Why validation of prognostic models matters?
Radiother Oncol. 2018;127:370–3.
31. Choe J, Lee SM, Do K-H, Lee G, Lee J-G, Lee SM, et al. Deep
Learning-based Image Conversion of CT Reconstruction Kernels Improves Radiomics
Reproducibility for Pulmonary Nodules or Masses. Radiology. 2019;292:365–73.
32. Chatterjee A, Vallières M, Dohan A, Levesque IR, Ueno Y, Saif S,
et al. Creating robust predictive radiomic models for data from independent
institutions using normalization. IEEE Trans Radiat Plasma Med Sci. 2019;1–1.
33. Johnson WE, Li C, Rabinovic A. Adjusting batch effects in
microarray expression data using empirical Bayes methods. Biostat Oxf Engl. 2007;8:118–27.
34. Lucia F, Visvikis D, Vallières M, Desseroit
M-C, Miranda O, Robin P, et al. External validation of a combined PET and
MRI radiomics model for prediction of recurrence in cervical cancer patients
treated with chemoradiotherapy. Eur J Nucl Med Mol Imaging. 2019;46:864–77.
35. Chen C, Grennan K, Badner J, Zhang D, Gershon E, Jin L, et al.
Removing batch effects in analysis of expression microarray data: an evaluation
of six batch adjustment methods. PloS One. 2011;6:e17238.
36. Hatt M, Lee JA, Schmidtlein CR, Naqa IE, Caldwell C, De Bernardi
E, et al. Classification and evaluation strategies of auto-segmentation
approaches for PET: Report of AAPM task group No. 211. Med Phys. 2017;44:e1–42.
37. Berthon B, Spezi E, Galavis P, Shepherd T, Apte A, Hatt M, et al.
Toward a standard for the evaluation of PET-Auto-Segmentation methods following
the recommendations of AAPM task group No. 211: Requirements and
implementation. Med Phys. 2017;44:4098–111.
38. Hatt M, Laurent B, Ouahabi A, Fayad H, Tan S, Li L, et al. The
first MICCAI challenge on PET tumor segmentation. Med Image Anal. 2018;44:177–95.
39. Tixier F, Le Rest CC, Hatt M, Albarghach N, Pradier O, Metges JP,
et al. Intratumor heterogeneity characterized by textural features on baseline
18F-FDG PET images predicts response to concomitant radiochemotherapy in
esophageal cancer. J Nucl Med. 2011;52:369–78.