TOASTS
Official
Summary
We analyse the influence of multi-task learning strategies using task families on the English abstractive text summarization task. We combine tasks under one of three strategies, i.e., sequential, simultaneous, and continual multi-task learning, and evaluate the trained models on two downstream tasks. We find that certain combinations of task families (e.g., advanced reading comprehension and natural language inference) positively impact downstream performance. Further, we find that the choice and combination of task families influences downstream performance more than the training scheme, supporting the use of task families for abstractive text summarization.
We investigate the role of multi-task learning in English abstractive text summarization. To this end, we organize 18 pre-selected training tasks into six higher-level, modular task families. Further, we compare three training schemes for the pre-finetuning stage and their respective mixing strategies based on changes across multiple evaluation scores.
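To make the three training schemes concrete, the sketch below shows one plausible reading of them purely as batch-mixing orders over task families; it is not taken from the paper, and all family and task names except natural language inference and advanced reading comprehension are placeholders.

```python
import random

# Illustrative only: the paper uses 18 tasks in six families; the names below
# are placeholders except NLI and advanced reading comprehension, which are
# mentioned in the summary.
TASK_FAMILIES = {
    "natural_language_inference": ["nli_task_1", "nli_task_2"],
    "advanced_reading_comprehension": ["rc_task_1", "rc_task_2"],
    "other_family": ["task_a", "task_b"],
}

def mixing_order(scheme, steps_per_family=2, seed=0):
    """Yield (family, task) pairs according to one of the three schemes."""
    rng = random.Random(seed)
    families = list(TASK_FAMILIES)
    if scheme == "sequential":            # finish one family before the next
        for fam in families:
            for _ in range(steps_per_family):
                yield fam, rng.choice(TASK_FAMILIES[fam])
    elif scheme == "simultaneous":        # sample from all families at every step
        for _ in range(steps_per_family * len(families)):
            fam = rng.choice(families)
            yield fam, rng.choice(TASK_FAMILIES[fam])
    elif scheme == "continual":           # grow the sampling pool family by family
        pool = []
        for fam in families:
            pool.append(fam)
            for _ in range(steps_per_family):
                f = rng.choice(pool)
                yield f, rng.choice(TASK_FAMILIES[f])
    else:
        raise ValueError(f"unknown scheme: {scheme}")

for scheme in ("sequential", "simultaneous", "continual"):
    print(scheme, list(mixing_order(scheme)))
```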
Architecture

Pre-Training
TOASTS groups selected pre-training tasks into task families and explores the correlation of these families, their influence on two downstream tasks, and their aggregation through three training schemes. To this end, we use pre-finetuning, a second, inexpensive training stage between pre-training and fine-tuning, which was recently proposed by Muppet and further tested by ExT5. Pre-finetuning has two main parts:
- task family setup - groups tasks and their related datasets into broader families according to their primary objective
- training strategies - tasks from these families are then combined following a training strategy, and the resulting model is evaluated on a final downstream task (a minimal sketch follows below)
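A minimal sketch of what such a pre-finetuning stage could look like, assuming a shared encoder with one lightweight head per task family; the encoder, dimensions, and dummy data are placeholders and do not reflect the models or datasets used in the paper.

```python
import torch
import torch.nn as nn

# Placeholder mapping family -> number of output classes; the real setup uses
# 18 tasks in six families on top of a pre-trained sequence-to-sequence model.
FAMILY_NUM_CLASSES = {"natural_language_inference": 3, "reading_comprehension": 2}

class PreFinetuneModel(nn.Module):
    """Shared encoder (stand-in for a pre-trained model) plus per-family heads."""
    def __init__(self, in_dim=64, hidden=32):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU())
        self.heads = nn.ModuleDict(
            {fam: nn.Linear(hidden, n) for fam, n in FAMILY_NUM_CLASSES.items()}
        )

    def forward(self, x, family):
        return self.heads[family](self.encoder(x))

model = PreFinetuneModel()
optimizer = torch.optim.AdamW(model.parameters(), lr=3e-5)
loss_fn = nn.CrossEntropyLoss()

# Pre-finetuning loop: each step draws a batch from one task family and updates
# the shared weights; the order of (family, batch) pairs is what the training
# strategy (sequential / simultaneous / continual) controls.
for family, n_classes in FAMILY_NUM_CLASSES.items():
    x = torch.randn(8, 64)                     # dummy inputs
    y = torch.randint(0, n_classes, (8,))      # dummy labels
    loss = loss_fn(model(x, family), y)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

# After pre-finetuning, the shared encoder would be fine-tuned on the final
# abstractive summarization task.
```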


Experiments


Performance


Further Readings