
What's a Transformer? Who is HuggingFace? And why its tie-up with Amazon is important

Updated: Mar 30, 2021

Before I discuss this, a little bit of history around the evolution of AI.


In the past ten years - in addition to greater hardware power and data availability - there have been two large step-changes in AI modelling capability. Firstly, image recognition.


Image Recognition - the first big step forward in AI


Before convolutional neural networks came to prominence, around 2012, image recognition was pretty basic. Yes, we could recognise a set of image pixels as being, for example, a cat, but to identify the image, the cat might need to be on all fours, with its head toward the camera. And the size and shape of the cat, or the image resolution, might also limit recognition.



Transformers and Natural Language Processing (NLP) - the second big step


NLP refers to the ability of an automated system to make sense of 'natural language' (eg a set of English sentences). Early systems were really glorified dictionaries: they could identify words, but were much poorer at establishing context. For example, does the word 'match' refer to the thing you strike? Shorthand for a game? A test of whether two or more objects are the same?


In June 2017, researchers at Google published a seminal paper titled "Attention Is All You Need". It describes the concept of the Transformer, which now represents the state of the art in NLP within neural networks.


Transformers are much better at establishing context, as they take the entire sentence - and beyond - into account. They are also far more powerful than previous automated systems because they support parallel processing, providing much faster throughput: rather than reading a sentence one word at a time, a Transformer relates every word to every other word simultaneously, in both directions at once.
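
As a small illustration of this context-sensitivity, here is a minimal sketch using HuggingFace's 'fill-mask' pipeline with a standard BERT model (the model and sentences are just examples, not from the original post). The model's top prediction for the masked word depends entirely on the surrounding sentence, echoing the 'match' ambiguity above:

```python
# Illustrative sketch (not from the original post): BERT resolving a masked
# word from its context, via HuggingFace's fill-mask pipeline.
from transformers import pipeline

unmasker = pipeline("fill-mask", model="bert-base-uncased")

# The same blank is filled differently depending on the rest of the sentence.
print(unmasker("He struck a [MASK] to light the candle.")[0]["token_str"])
print(unmasker("United won the [MASK] with a last-minute goal.")[0]["token_str"])
```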


Since 2017, improvements have been made to the original Transformer model. Iterations have included BERT (Bidirectional Encoder Representations from Transformers) and, more recently, BART, which was developed by Facebook's AI division - with the associated paper published in late October 2019. BART is described as "Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension". And you can't say fairer than that!


The summarisation example, further down this post, uses the BART model.


Back to HuggingFace...


Founded in 2016, New York-based HuggingFace focuses on providing NLP models and associated 'toolkits'. Today, HuggingFace hosts over 7,000 NLP models and provides services to more than 5,000 organisations.


Why is its tie-up with Amazon important?


Amazon Web Services (AWS) is the world's largest Cloud services provider - in 2020, revenue was over $45 billion. It provides approximately 200 AWS Cloud-based services and continues to invest heavily in AI.


The HuggingFace tie-up should see some very powerful NLP tools emerge that are likely to have a significant impact on the workplace. And, given that a great deal of managerial work involves processing textual information to extract meaning, this is an area that could be significantly affected - for good or bad, depending on your viewpoint.


Example using the BART Transformer to summarise text


BART is a very powerful Transformer. A version of it, fine-tuned for 'summarization', has been trained using over 300,000 unique news articles from CNN and the Daily Mail. Each of these training articles contains its full text and, separately, human-written highlights that indicate which parts are relevant for creating a summary. The Transformer uses these pairs to 'understand' how best to create a summary from previously unseen articles.
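
That training corpus is publicly available and easy to inspect. Here is a minimal sketch, assuming the version hosted by the HuggingFace 'datasets' library (this step is not part of the original example):

```python
# Illustrative sketch: peeking at the CNN/Daily Mail corpus described above,
# via the HuggingFace 'datasets' library (assumed installed).
from datasets import load_dataset

dataset = load_dataset("cnn_dailymail", "3.0.0", split="train")
print(dataset.num_rows)           # roughly 287,000 training articles

example = dataset[0]
print(example["article"][:200])   # the full article text (first 200 characters)
print(example["highlights"])      # the human-written summary highlights
```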


Although I don't usually display the programming code involved, I want you to see the main part of it, just to show how little coding is required to create the summary. Note: lines starting with '#' are simply comments:


Code to generate a summary
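
(The code appeared as an image in the original post. Below is a reconstruction based on the walk-through that follows - the minimum and maximum length values are illustrative, as the original's exact figures are not visible.)

```python
# Reconstructed from the walk-through below: summarising a text file with
# the HuggingFace 'summarization' pipeline, which by default resolves to a
# (distil)BART model fine-tuned on CNN/Daily Mail news articles.
from transformers import pipeline

file = open("example3.txt")              # 1. open the file to be summarised
full_text = file.read()                  # 2. read its contents into full_text
summariser = pipeline("summarization")   # 3. create the 'summarization' object
summary = summariser(full_text, min_length=100, max_length=150)  # 4. generate the summary

print(summary[0]["summary_text"])        # display the result
```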

Step by step walk-through of the code:

  1. Opens a file containing the text to be summarised. The file is named example3.txt.

  2. Reads the contents of the file and stores it in full_text.

  3. Creates a Transformer 'summarization' object.

  4. Finally, it creates the summary by passing full_text to the summariser object. It also provides the required minimum length for the summary and the maximum length. The minimum length is more a guide than an absolute requirement (the lengths are counted in tokens rather than words).

Four lines of code; that's all that is required! All of the 'clever' AI stuff happens within the very sophisticated BART model.


I ran this code with example3.txt containing the first 3 paragraphs of the unedited Wikipedia description of Einstein. Here is the full_text:


Albert Einstein (/ˈaɪnstaɪn/ EYEN-styne;[4] German: [ˈalbɛʁt ˈʔaɪnʃtaɪn] (About this soundlisten); 14 March 1879 – 18 April 1955) was a German-born theoretical physicist,[5] widely acknowledged to be one of the greatest physicists of all time. Einstein is known widely for developing the theory of relativity, but he also made important contributions to the development of the theory of quantum mechanics. Relativity and quantum mechanics are together the two pillars of modern physics. [3][6] His mass–energy equivalence formula E = mc2, which arises from relativity theory, has been dubbed "the world's most famous equation".[7] His work is also known for its influence on the philosophy of science.[8][9] He received the 1921 Nobel Prize in Physics "for his services to theoretical physics, and especially for his discovery of the law of the photoelectric effect",[10] a pivotal step in the development of quantum theory. His intellectual achievements and originality resulted in "Einstein" becoming synonymous with "genius".[11]


In 1905, a year sometimes described as his annus mirabilis ('miracle year'), Einstein published four groundbreaking papers. These outlined the theory of the photoelectric effect, explained Brownian motion, introduced special relativity, and demonstrated mass-energy equivalence. Einstein thought that the laws of classical mechanics could no longer be reconciled with those of the electromagnetic field, which led him to develop his special theory of relativity. He then extended the theory to gravitational fields; he published a paper on general relativity in 1916, introducing his theory of gravitation. In 1917, he applied the general theory of relativity to model the structure of the universe.[12][13] He continued to deal with problems of statistical mechanics and quantum theory, which led to his explanations of particle theory and the motion of molecules. He also investigated the thermal properties of light and the quantum theory of radiation, which laid the foundation of the photon theory of light. However, for much of the later part of his career, he worked on two ultimately unsuccessful endeavors. First, despite his great contributions to quantum mechanics, he opposed what it evolved into, objecting that nature "does not play dice".[14] Second, he attempted to devise a unified field theory by generalizing his geometric theory of gravitation to include electromagnetism. As a result, he became increasingly isolated from the mainstream of modern physics.


Einstein was born in the German Empire, but moved to Switzerland in 1895, forsaking his German citizenship the following year. In 1901, he acquired Swiss citizenship, which he kept for the rest of his life. In 1905, he was awarded a PhD by the University of Zurich. In 1933, while Einstein was visiting the United States, Adolf Hitler came to power. Einstein did not return to Germany because he objected to the policies of the newly elected Nazi-led government.[15] He settled in the United States and became an American citizen in 1940.[16] On the eve of World War II, he endorsed a letter to President Franklin D. Roosevelt alerting him to the potential German nuclear weapons program and recommending that the US begin similar research. Einstein supported the Allies, but he generally denounced the idea of nuclear weapons.



And here, generated a few seconds later by the trained BART model, is the 125-word 'summarised' output:


Albert Einstein was one of the greatest physicists of all time . He was born in Germany in 1879, but moved to Switzerland in 1895, forsaking his German citizenship the following year . In 1905, a year sometimes described as his annus mirabilis ('miracle year'), Einstein published four groundbreaking papers . His mass–energy equivalence formula E = mc2, which arises from relativity theory, has been dubbed "the world's most famous equation" His work is also known for its influence on the philosophy of science . He received the 1921 Nobel Prize in Physics "for his services to theoretical physics, and especially for his discovery of the law of the photoelectric effect", a pivotal step in the development of quantum theory" "Einstein" is synonymous with "genius".


The takeaway...


Transformers have only just begun to have an impact on NLP and its use in organisations.


The recent tie-up between HuggingFace and AWS will increase the sophistication of the resultant tools and their implementation - and impact - in the workplace.



Finally, for those who want a peek at a 'vanilla' Transformer's structure...


The diagram, below, summarises its key features. It is these components that enable the Transformer to make 'sense' of sequences - and hence of context in NLP - to retain memory, and to process in parallel, thereby achieving much greater throughput than Recurrent Neural Networks.




Source: Vaswani et al (2017)
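
For a flavour of what sits at the heart of that diagram, here is a minimal sketch of the scaled dot-product attention operation, in plain numpy (illustrative only, not the paper's code): every position attends to every other position in a single matrix operation, which is exactly what makes the parallel processing possible.

```python
# Minimal sketch of scaled dot-product attention, the core operation in the
# Transformer diagram above. Every position attends to all others at once.
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                 # similarity of each position to every other
    scores -= scores.max(axis=-1, keepdims=True)    # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over positions
    return weights @ V                              # context-weighted mix of the values

# Toy example: a 'sentence' of 4 positions, each an 8-dimensional vector.
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))
print(scaled_dot_product_attention(x, x, x).shape)  # -> (4, 8)
```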



