Phi-3-mini: AI-based small language models (SLMs)

Context: Microsoft unveiled the latest version of its lightweight AI model, the Phi-3-mini, reportedly the first among three small language models (SLMs) that the company plans to release.

More about news: 

  • Phi-3-mini has 3.8 billion parameters (a measure of the size and complexity of an AI model) and is trained on a data set that is smaller than those used for LLMs such as OpenAI's GPT-4.
  • The amount of conversation that an AI model can read and write at any given time is called the context window and is measured in tokens; a token is the fundamental unit of data a language model uses to process and generate text (see the sketch after this list).
  • Phi-3-mini is the first model in its class to support a context window of up to 128,000 tokens, which lets the model recall far more information during a session, with little impact on quality.
  • Phi-3-mini requires less computational power and offers much lower latency than larger models.
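The relationship between text, tokens, and the context window can be shown with a short sketch. This is a deliberately simplified illustration, not Phi-3-mini's actual tokenizer: real models split text into subword tokens, but a naive word split is enough to show how a 128,000-token context window caps how much conversation the model can "see" at once.

```python
# Simplified illustration of tokens and a context window.
# Real models such as Phi-3-mini use subword tokenizers, not a whitespace
# split; 128,000 is the context-window figure reported for Phi-3-mini.

CONTEXT_WINDOW = 128_000  # maximum tokens the model can attend to at once


def count_tokens(text: str) -> int:
    """Very rough token count: one token per whitespace-separated word."""
    return len(text.split())


def fits_in_context(conversation: list[str]) -> bool:
    """Check whether the whole conversation still fits in the context window."""
    total = sum(count_tokens(turn) for turn in conversation)
    return total <= CONTEXT_WINDOW


conversation = [
    "User: Summarise the report on small language models.",
    "Assistant: Sure, here is a short summary...",
]
print(fits_in_context(conversation))  # True for a short exchange like this
```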

About Small language models (SLMs):

  • SLMs are compact versions of large language models (LLMs), which can comprehend and generate human-language text.
  • The new model expands the selection of high-quality models for customers, offering more practical choices as they build generative AI applications.
  • SLMs are more streamlined versions of large language models. Compared to LLMs, these smaller AI models are also cost-effective to develop and operate, and they perform better on smaller devices like laptops and smartphones.
  • LLMs are trained on massive general data, while SLMs stand out for their specialisation. Through fine-tuning, SLMs can be customised for specific tasks, achieving accuracy and efficiency in the process. Most SLMs undergo targeted training, which demands considerably less computing power and energy than LLMs.
  • SLMs also differ from LLMs with reference to inference latency, which is the time taken for a model to make predictions or decisions after receiving input.
  • Their compact size allows for quicker processing, making them more responsive and well suited to real-time applications such as virtual assistants and chatbots, while their lower cost makes them appealing to smaller organisations and research groups.
  • SLMs are highly versatile, useful in applications ranging from sentiment analysis to code generation (a sentiment-analysis sketch follows this list). Their compact size and efficient computation also make them ideal for use on edge devices and in resource-limited settings.
  • Most popular SLMs: Llama 2 developed by Meta AI, Mistral and Mixtral, Microsoft’s Phi and Orca, Alpaca 7B and StableLM.
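As a concrete example of the lightweight, real-time use cases mentioned above, the sketch below runs a compact sentiment-analysis model and times a single prediction. It is a minimal sketch, assuming the Hugging Face transformers library (with a backend such as PyTorch) is installed; the default pipeline model is a small distilled model chosen by the library, not Phi-3-mini.

```python
# Minimal sketch: sentiment analysis with a compact model, plus a rough
# inference-latency measurement. Assumes the Hugging Face `transformers`
# library is installed; the default pipeline model is a small distilled
# model, not Phi-3-mini.
import time

from transformers import pipeline

# Load a small sentiment-analysis model (downloaded on first use).
classifier = pipeline("sentiment-analysis")

start = time.perf_counter()
result = classifier("Small language models run well on laptops.")
latency = time.perf_counter() - start

print(result)                          # e.g. [{'label': 'POSITIVE', 'score': ...}]
print(f"Inference latency: {latency:.3f} s")
```

The timing around the single call illustrates the inference-latency point made earlier: because the model is small, a prediction typically returns quickly even on ordinary laptop hardware.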


Practice Question:

Q. With reference to Small language models (SLMs), consider the following statements:

1. Small language models (SLMs) are trained on massive general data.

2. Small language models (SLMs) are cost-effective to develop and operate.

3. Small language models (SLMs) are useful in applications ranging from sentiment analysis to code generation.

4. OpenAI's GPT-4 is the latest version of small language models (SLMs).

How many of the above statements are correct?

(a) Only one

(b) Only two

(c) Only three

(d) All four

Answer: (b)

Statement 1 is incorrect: LLMs are trained on massive general data, while SLMs stand out for their specialisation and undergo targeted training on smaller data sets. Through fine-tuning, SLMs can be customised for specific tasks, achieving accuracy and efficiency in the process.

Statement 2 is correct: When compared to LLMs, smaller AI models are also cost-effective to develop and operate, and they perform better on smaller devices like laptops and smartphones.

Statement 3 is correct: SLMs are highly versatile, useful in applications ranging from sentiment analysis to code generation. Their compact size and efficient computation also make them ideal for use on edge devices and in resource-limited settings.

Statement 4 is incorrect: OpenAI's GPT-4 is one of the latest large language models (LLMs), not a small language model.


PYQ: (2022)

Q. With reference to Non-Fungible Tokens (NFTs), consider the following statements:

1. They enable the digital representation of physical assets.

2. They are unique cryptographic tokens that exist on a blockchain.

3. They can be traded or exchanged at equivalency and therefore can be used as a medium of commercial transactions.

Which of the statements given above are correct?

(a) 1 and 2 only

(b) 2 and 3 only

(c) 1 and 3 only

(d) 1, 2 and 3

Ans: (a)
