Content
1 — Introduction to LLMs
2 — Open Sourcing of LLMs
3 — List of Open Source Large Language Models (LLMs)
Introduction to LLMs
Large language models (LLMs) are a type of artificial intelligence (AI) that are trained on massive datasets of text and code. They can be used for a variety of tasks, including generating text, translating languages, and writing different kinds of creative content.
LLMs work by learning the statistical relationships between words and phrases in a language. This allows them to generate text that is both grammatically correct and semantically meaningful.
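To make this concrete, here is a minimal text-generation sketch using the Hugging Face transformers library and the small, openly licensed GPT-2 model (the choice of library and model is an assumption for illustration; the article itself does not prescribe any particular tooling):

```python
# Minimal text-generation sketch with the Hugging Face `transformers` library.
# Assumes `pip install transformers torch`; GPT-2 is chosen only because it is
# small and openly licensed, not because it is the article's reference model.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")

# The model repeatedly predicts a likely next token given the text so far,
# i.e. the "statistical relationships between words and phrases" described above.
result = generator("Open-source large language models can", max_new_tokens=30)
print(result[0]["generated_text"])
```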
Open Sourcing of LLMs
In recent years, there has been a growing interest in open-source LLMs. These models are released under open-source licenses, which means that anyone can use, modify, and distribute them. This has made it possible for researchers, developers, and businesses to experiment with LLMs and to develop new applications for them.
There are a number of benefits to using open-source LLMs. First, they are often more affordable than proprietary LLMs. Second, they are more transparent, which means that researchers can study how they work and how they make decisions. Third, they are more flexible, which means that they can be customized for different tasks.
There are also some challenges associated with using open-source LLMs. First, they can be complex to use and to train. Second, they can be computationally expensive to run. Third, they can be used for malicious purposes, such as generating fake news or spam.
Despite these challenges, open-source LLMs have the potential to revolutionize the way we interact with computers. They have the ability to automate tasks that are currently done by humans, and they can be used to create new and innovative applications.
List of Open Source Large Language Models (LLMs)
Recently, the world of Natural Language Processing (NLP) has witnessed a phenomenal surge in the development and release of Large Language Models (LLMs). This trend can be largely attributed to the resounding success of models like ChatGPT, which have shown remarkable capabilities in understanding and generating human-like text. However, the concentration of LLMs in the hands of a few tech giants has given rise to a growing demand for open-source alternatives. In response, the open-source community has taken up the challenge and has been actively creating its own LLMs. These open-source LLMs offer several advantages, such as a faster development pace, lower alignment costs, and increased transparency.
With such a vast and dynamic landscape of open-source LLMs, it becomes increasingly challenging to keep track of all the models being released on a daily basis. Hence, this article aims to provide a comprehensive list of open-source LLMs currently available, along with information about their licensing options and source code repositories. Let’s dive into the world of open-source LLMs!
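Most of the models listed below publish their weights on the Hugging Face Hub, so a single loading pattern covers the majority of them. The following sketch is a hedged illustration of that pattern (the transformers library and the small facebook/opt-125m checkpoint are stand-ins; substitute the repository ID of whichever model you pick from the list, and check its license first):

```python
# Generic pattern for loading an open-source causal LLM from the Hugging Face Hub.
# Assumes `transformers` and `torch` are installed. The 125M-parameter OPT
# checkpoint is only a small stand-in; swap in the repo ID of any model below.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "facebook/opt-125m"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

inputs = tokenizer("The benefits of open-source LLMs include", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=30)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```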
SAIL 7B
Description: Search-augmented instruction learning based on the LLaMA model.
Params: 7B
License: GPL-3.0 license
Release Date: May 25, 2023
Github: Source Code
Paper: SAIL — Search Augmented Instruction Learning
Guanaco
Description: LLM released with the efficient finetuning approach QLoRA
Params: 65B
License: MIT
Release Date: May 24, 2023
Github: Source Code
Paper: QLoRA — Efficient Finetuning of Quantized LLMs
RWKV
Description: An RNN with transformer-level LLM performance
Params: 100M–14B
License: Apache 2.0
Release Date: May 15, 2023
Github: Source Code
Paper: Scaling RNN to 1.5B and Reach Transformer LM Performance
MPT-7B
Description: MosaicML’s Foundation series models
Params: 7B
License: Apache 2.0
Release Date: May 5, 2023
Github: Source Code
Paper: MPT-7B — A New Standard for Open-Source, Commercially Usable LLMs
OpenLLaMA
Description: Open-source reproduction of Meta AI’s LLaMA 7B, trained on the RedPajama dataset.
Params: 3B, 7B
License: Apache 2.0
Release Date: May 5, 2023
Github: Source Code
Paper: Meet OpenLLaMA — An Open-Source Reproduction of Meta AI’s LLaMA Large Language Model
RedPajama-INCITE
Description: Instruction-tuned and chat models based on the Pythia model, trained on the RedPajama dataset.
Params: 3B, 7B
License: Apache 2.0
Release Date: May 5, 2023
Github: Source Code
Paper: RedPajama-INCITE family of models including base, instruction-tuned & chat models
h2oGPT
Description: H2O.ai’s fine-tuning framework and chatbot UI with document question-answering capabilities
Params: 12B, 30B
License: Apache 2.0
Release Date: May 3, 2023
Github: Source Code
Paper: Building the World’s Best Open-Source Large Language Model: H2O.ai’s Journey
FastChat-T5
Description: Chatbot trained by fine-tuning Flan-T5-XL on user-shared conversations collected from ShareGPT
Params: 3B
License: Apache 2.0
Release Date: Apr 28, 2023
Github: Source Code
Paper: FastChat-T5 — our compact and commercial-friendly chatbot!
GPT4All
Description: Ecosystem to train and deploy powerful and customized LLMs
Params: 7B–13B
License: MIT
Release Date: Apr 24, 2023
Github: Source Code
Paper: GPT4All: An ecosystem of open-source on-edge large language models.
MiniGPT-4
Description: Visual LLM based on BLIP-2 and the Vicuna LLM
Params: 13B
License: BSD-3-Clause
Release Date: Apr 20, 2023
Github: Source Code
Paper: MiniGPT-4 — Enhancing Vision-Language Understanding with Advanced Large Language Models
StableLM
Description: Stability AI’s LLM model series
Params: 7B
License: CC BY-NC-SA-4.0
Release Date: Apr 19, 2023
Github: Source Code
Paper: Stability AI Launches the First of its StableLM Suite of Language Models
BLOOMZ
Description: BLOOM-family model finetuned for cross-lingual generalization through multitask finetuning
Params: 176B
License: Apache 2.0
Release Date: Apr 19, 2023
Github: Source Code
Paper: Cross-lingual Generalization through Multitask Finetuning
Dolly
Description: Pythia 12B LLM instruction-tuned on the Databricks ML platform
Params: 12B
License: Apache 2.0
Release Date: Apr 12, 2023
Github: Source Code
Paper: Free Dolly — Introducing the World’s First Truly Open Instruction-Tuned LLM
Baize Chatbot
Description: Open-source chat model based on LLaMA
Params: 30B
License: GPL-3.0 license
Release Date: Apr 10, 2023
Github: Source Code
Paper: Baize — An Open-Source Chat Model with Parameter-Efficient Tuning on Self-Chat Data
ColossalChat
Description: A complete RLHF pipeline released as open source by Colossal-AI
Params: N/A
License: Apache 2.0
Release Date: Apr 6, 2023
Github: Source Code
Paper: ColossalChat — An Open-Source Solution for Cloning ChatGPT With a Complete RLHF Pipeline
Lit-LLaMA
Description: Open-source implementation of LLaMA from Lightning AI
Params: 13B
License: Apache 2.0
Release Date: Apr 4, 2023
Github: Source Code
Paper: Why We’re Building Lit-LLaMA
Cerebras-GPT
Description: Family of Open, Compute-efficient, Large Language Models
Params: 111M-13B
License: Apache 2.0
Release Date: Mar 28, 2023
Github: Source Code
Paper: Cerebras-GPT — Open Compute-Optimal Language Models Trained on the Cerebras Wafer-Scale Cluster
Open Flamingo
Description: Open-source implementation of DeepMind’s Flamingo model
Params: 9B
License: MIT License
Release Date: Mar 28, 2023
Github: https://github.com/mlfoundations/open_flamingo
Paper: Openflamingo — An Open-source Framework For Training Vision-language Models With In-context Learning
ChatGLM
Description: Open bilingual (English & Chinese) bidirectional dense pre-trained model
Params: 6B-130B
License: Apache 2.0
Release Date: Mar 23, 2023
Github: Source Code
Paper: GLM-130B: An Open Bilingual Pre-trained Model
DLite
Description: Instruction-following model from AI Squared, built by fine-tuning the smallest GPT-2 model on the Alpaca dataset
Params: 124M
License: Apache 2.0
Release Date: Mar 16, 2023
Github: Source Code
Paper: Introducing DLite, a Lightweight ChatGPT-Like Model Based on Dolly
Alpaca 7B
Description: Stanford’s Instruction-following LLaMA Model
Params: 7B
License: Apache 2.0
Release Date: Mar 13, 2023
Github: Source Code
Paper: Alpaca — A Strong, Replicable Instruction-Following Model
Flan UL2
Description: Flan 20B model trained on top of the pre-trained UL2 checkpoint.
Params: 20B
License: MIT License
Release Date: Mar 3, 2023
Github: Source Code
Paper: A New Open Source Flan 20B with UL2
Flan-T5
Description: Instruction finetuning of T5 on a broad set of datasets to improve the usability of pre-trained language models (a usage sketch follows this entry)
Params: 60M–11B
License: Apache 2.0
Release Date: Feb 1, 2023
Github: Source Code
Paper: Scaling Instruction-Finetuned Language Models
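Because Flan-T5 is instruction-tuned, it can be prompted directly with a natural-language task description. A minimal sketch, assuming the transformers library and the public google/flan-t5-small checkpoint (the smallest member of the family):

```python
# Instruction following with Flan-T5 via the text2text-generation pipeline.
# Assumes `transformers` is installed; flan-t5-small keeps the demo lightweight.
from transformers import pipeline

flan = pipeline("text2text-generation", model="google/flan-t5-small")

# Instruction finetuning lets a plain task description steer the model.
print(flan("Translate to German: The weather is nice today.")[0]["generated_text"])
```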
Open Assistant
Description: Project aiming to give everyone access to a great chat-based large language model.
Params: N/A
License: Apache 2.0
Release Date: Dec 11, 2022
Github: Source Code
Paper: Open Assistant — Assistant for the Future
Galactica
Description: General-purpose scientific language model trained on scientific texts
Params: 120M-120B
License: Apache 2.0
Release Date: Nov 16, 2022
Github: Source Code
Paper: Galactica — A Large Language Model for Science
Bloom
Description: Largest multilingual open-access language model from BigScience
Params: 176B
License: OpenRAIL-M v1
Release Date: Nov 9, 2022
Github: Source Code
Paper: BLOOM — A 176B-Parameter Open-Access Multilingual Language Model
UL2
Description: An open-source unified language learner from Google Research
Params: 20B
License: MIT License
Release Date: Nov 3, 2022
Github: Source Code
Paper: UL2 — Unifying Language Learning Paradigms
Tk-Instruct
Description: LLM from AllenAI that is tuned to solve many NLP tasks by following instructions.
Params: 3B, 7B
License: MIT License
Release Date: Oct 24, 2022
Github: Source Code
Paper: Super-NaturalInstructions: Generalization via Declarative Instructions on 1600+ NLP Tasks
YaLM
Description: Pretrained LLM from Yandex for generating and processing text
Params: 100B
License: Apache 2.0
Release Date: June 19, 2022
Github: Source Code
Paper: Yandex Open-Sources YaLM Model With 100 Billion Parameters
OPT
Description: Series of open-sourced causal LLMs released by Meta AI that perform similarly to GPT-3
Params: 125M-175B
License: MIT License
Release Date: May 2, 2022
Github: Source Code
Paper: OPT — Open Pre-trained Transformer Language Models
GPT-NeoX
Description: EleutherAI’s open-source version of GPT with fewer parameters
Params: 20B
License: Apache 2.0
Release Date: Apr 14, 2022
Github: Source Code
Paper: GPT-NeoX-20B — An Open-Source Autoregressive Language Model
GPT-J
Description: EleutherAI’s open-source version of GPT with fewer parameters
Params: 6B
License: Apache 2.0
Release Date: Jun 4, 2021
Github: Source Code
Paper: GPT-J-6B: 6B JAX-Based Transformer
Switch
Description: A trillion-parameter AI language model developed by Google
Params: 1.6T
License: MIT License
Release Date: Feb 16, 2021
Github: Source Code
Paper: Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficient Sparsity
Older Models
XLNet
Description: A generalized autoregressive pre-training model that maximizes the expected likelihood over all permutations of the factorization order.
Params: 340M
License: Apache 2.0
Release Date: June 19, 2019
Github: Source Code
Paper: XLNet: Generalized Autoregressive Pretraining for Language Understanding
GPT-2
Description: Second iteration of a language model using the Transformer architecture from OpenAI
Params: 1.5B
License: MIT License
Release Date: Feb 4, 2019
Github: Source Code
Paper: Language Models are Unsupervised Multitask Learners
BERT
Description: Language representation model based on the Transformer encoder, with Masked Language Modelling (MLM) as its pre-training objective (a fill-mask sketch follows this entry).
Params: 340M
License: Apache 2.0
Release Date: Oct 11, 2018
Github: Source Code
Paper: BERT — Pre-training of Deep Bidirectional Transformers for Language Understanding
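To make the MLM objective concrete, here is a small sketch, assuming the transformers library and the public bert-base-uncased checkpoint, that asks BERT to fill in a masked token:

```python
# Masked language modelling demo with BERT via the fill-mask pipeline.
# Assumes `transformers` is installed; bert-base-uncased is the original public checkpoint.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="bert-base-uncased")

# BERT predicts the token behind [MASK] from both left and right context,
# which is the pre-training objective described above.
for candidate in fill_mask("The capital of France is [MASK]."):
    print(candidate["token_str"], round(candidate["score"], 3))
```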
GPT-1
Description: First iteration of a language model using the Transformer architecture from OpenAI
Params: 117M
License: MIT License
Release Date: June 11, 2018
Github: Source Code
Paper: Improving Language Understanding by Generative Pre-Training
What do the Licenses mean?
- Apache 2.0: The Apache 2.0 license is a permissive open-source license that allows free use, modification, and distribution of the model’s source code. Users are also allowed to sublicense the model under different terms.
- MIT License: The MIT License is another permissive open-source license that grants users the freedom to use, modify, and distribute the model’s source code, requiring only that the original copyright and license notice be preserved. It is widely used in the open-source community due to its simplicity and flexibility.
- GPL-3.0 License: The GNU General Public License 3.0 is a copyleft license that requires any derivative work or modifications of the model to be distributed under the same license terms. It emphasizes the principles of open-source software and ensures that the code remains freely available to the public.
- BSD-3-Clause License: The 3-Clause BSD License is a permissive license that allows users to use, modify, and distribute the model’s source code, with the added condition that proper attribution must be given to the original authors.
- CC BY-NC-SA-4.0 License: The Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License allows users to use, modify, and distribute the model’s source code for non-commercial purposes, as long as they provide appropriate attribution and use the same license when distributing their derivative work.
In conclusion, the landscape of open-source Large Language Models is rapidly evolving, with numerous models being released regularly by the open-source community. These models offer an exciting opportunity for developers, researchers, and enthusiasts to experiment with cutting-edge language technologies without the constraints of proprietary systems. As more organizations and individuals contribute to the development of these models, we can expect to see even more powerful, accessible, and innovative language models that will shape the future of Natural Language Processing.