
Behind the Guardian’s analysis of 100 years of MPs’ language on immigration | Politics

The Guardian has found that there has been a significant shift to the right in sentiment on immigration among MPs speaking in the House of Commons over the past five years.

To conduct this analysis, the Guardian’s Data Science and Data Projects teams, in collaboration with University College London, developed an in-house machine learning model to measure language sentiment in debates in the House of Commons over a century.

Unlike off-the-shelf sentiment models, the Guardian’s version separates sentiment specifically about immigration from general emotive language about any topic.

The model was developed through the following process:

The researchers first used a list of trigger terms manually designed and verified by immigration history experts to identify conversations likely to be about immigration. This process narrowed the data down to a manageable sample.
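The trigger-term filtering step can be sketched as a simple keyword match. This is a minimal illustration only: the term list and function names below are invented assumptions, not the Guardian's actual expert-verified list.

```python
import re

# Illustrative trigger terms; the real list was curated and verified by
# immigration history experts.
TRIGGER_TERMS = ["immigration", "immigrant", "migrant", "asylum", "deportation"]

# \w* after the term allows plurals and inflections ("migrants", "asylums").
pattern = re.compile(r"\b(" + "|".join(TRIGGER_TERMS) + r")\w*\b", re.IGNORECASE)

def likely_about_immigration(speech: str) -> bool:
    """Flag a speech as a candidate immigration debate if any trigger term appears."""
    return bool(pattern.search(speech))

# Invented example contributions:
speeches = [
    "The Bill concerns asylum seekers arriving by small boat.",
    "I rise to speak on agricultural subsidies.",
]
candidates = [s for s in speeches if likely_about_immigration(s)]
```

Filtering like this trades recall for tractability, which is why the keyword choice itself then had to be stress-tested.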

To ensure the results were not biased by keyword choice, the team stress-tested their findings: they ran the analysis multiple times with different word combinations and found similar results regardless of the specific terms used.

To create the dataset on which the sentiment model was trained, a team of 12 manually labeled more than 1,250 parliamentary speeches and contributions, each of up to five sentences, drawn from across the century.

Passages about immigration were identified as such and then classified as positive, negative, or neutral. Passages not related to migration were labeled accordingly.
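The four-way labeling scheme described above can be sketched as an enumeration. The label names and the example pair are illustrative assumptions, not the annotators' actual schema.

```python
from enum import Enum

class Label(Enum):
    POSITIVE = "positive"            # immigration-related, positive sentiment
    NEGATIVE = "negative"            # immigration-related, negative sentiment
    NEUTRAL = "neutral"              # immigration-related, neutral sentiment
    NOT_MIGRATION = "not_migration"  # passage not about migration at all

# An annotated example is a (text, label) pair (invented text):
example = ("We welcome those who come to rebuild their lives here.", Label.POSITIVE)
```

Separating "not about migration" from the three sentiment classes is what lets the model distinguish immigration sentiment from general emotive language.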

The team also evaluated the performance of various large language models, a form of generative AI, in labeling additional passages; statistical tests confirmed their accuracy was robust.

The use of AI in this project was limited to the annotation process, which expanded the training dataset used to develop the Guardian's machine learning model to more than 22,600 annotated parliamentary contributions spanning the last century.

This particular model was then applied to a century of debates and speeches in the House of Commons, capturing almost 238,000 immigration-related pieces between 1925 and the end of 2025, each given a “sentiment tag.”

The overall sentiment score for each year was calculated using only immigration-related sections (a full conversation might combine immigration-related sections with those that are not). The annual score was then calculated by subtracting the number of negative snippets from the positive ones and dividing the result by the number of all migration-related snippets. This was also done separately for the main parties highlighted in the analysis.
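The annual scoring rule described above can be expressed directly: positives minus negatives, divided by all migration-related snippets. This is a minimal sketch with invented counts, not the Guardian's code.

```python
from collections import Counter

def annual_sentiment_score(labels: list[str]) -> float:
    """Score one year's immigration-related snippets on a -1 to +1 scale."""
    counts = Counter(labels)
    total = sum(counts.values())  # all migration-related snippets that year
    return (counts["positive"] - counts["negative"]) / total

# Invented example year: 3 positive, 5 negative, 2 neutral snippets.
year_labels = ["positive"] * 3 + ["negative"] * 5 + ["neutral"] * 2
score = annual_sentiment_score(year_labels)  # (3 - 5) / 10 = -0.2
```

Because neutral snippets count in the denominator but not the numerator, a year dominated by neutral contributions scores near zero even if the few opinionated snippets lean one way.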

Because the model was designed to measure sentiment in parliamentary speeches in aggregate, it was not used to report the sentiment of individual contributions. The analysis also excludes periods in which particular parties did not make enough contributions on immigration over a sustained period.
