The Rise of Open-Source LLMs: A Comprehensive List

Manikanth
9 min read · Jul 24, 2023


Source: https://explodinggradients.com/the-rise-of-open-source-large-language-models

Content

1 — Introduction to LLMs

2 — Open Sourcing of LLMs

3 — List of Open Source Large Language Models (LLMs)

Introduction to LLMs

Large language models (LLMs) are a type of artificial intelligence (AI) trained on massive datasets of text and code. They can be used for a variety of tasks, including generating text, translating languages, and writing many kinds of creative content.

LLMs work by learning the statistical relationships between words and phrases in a language. This allows them to generate text that is both grammatically correct and semantically meaningful.
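As a toy illustration of this idea, a bigram model estimates the probability of each next word from word-pair counts in a corpus. This is a minimal, hypothetical sketch of the kind of statistical relationships LLMs learn, at vastly larger scale and with far richer context:

```python
from collections import Counter, defaultdict

# Tiny "corpus" standing in for the massive text datasets LLMs train on.
corpus = "the cat sat on the mat . the dog sat on the rug .".split()

# Count how often each word follows each other word (bigram counts).
follows = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    follows[prev][nxt] += 1

def next_word_probs(word):
    """Relative frequency of each word observed after `word`."""
    counts = follows[word]
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}

print(next_word_probs("the"))  # 'cat', 'dog', 'mat', 'rug' each seen once after 'the'
print(next_word_probs("sat"))  # 'on' is the only word that follows 'sat' here
```

Real LLMs replace these raw counts with neural networks that condition on long contexts, but the output is the same in spirit: a probability distribution over the next token.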

Open Sourcing of LLMs

In recent years, there has been a growing interest in open-source LLMs. These models are released under open-source licenses, which means that anyone can use, modify, and distribute them. This has made it possible for researchers, developers, and businesses to experiment with LLMs and to develop new applications for them.

There are a number of benefits to using open-source LLMs. First, they are often more affordable than proprietary LLMs. Second, they are more transparent, which means that researchers can study how they work and how they make decisions. Third, they are more flexible, which means that they can be customized for different tasks.

There are also some challenges associated with using open-source LLMs. First, they can be complex to use and to train. Second, they can be computationally expensive to run. Third, they can be used for malicious purposes, such as generating fake news or spam.

Despite these challenges, open-source LLMs have the potential to revolutionize the way we interact with computers. They have the ability to automate tasks that are currently done by humans, and they can be used to create new and innovative applications.

List of Open Source Large Language Models (LLMs)

Recently, the world of Natural Language Processing (NLP) has witnessed a phenomenal surge in the development and release of Large Language Models (LLMs). This trend can be largely attributed to the resounding success of models like ChatGPT, which have shown remarkable capabilities in understanding and generating human-like text. However, the monopoly of LLMs in the hands of a few tech giants has given rise to a growing demand for open-source alternatives. In response, the open-source community has taken up the challenge and has been actively creating their own LLMs. These open-source LLMs offer several advantages, such as faster development pace, lower alignment costs, and increased transparency.

With such a vast and dynamic landscape of open-source LLMs, it becomes increasingly challenging to keep track of all the models being released on a daily basis. Hence, this article aims to provide a comprehensive list of open-source LLMs currently available, along with information about their licensing options and source code repositories. Let’s dive into the world of open-source LLMs!
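Given how quickly the landscape changes, one way to keep track of such a list programmatically is to store the entries as structured data and filter them, for instance by license. A minimal sketch, using a handful of model names and licenses taken from the list below:

```python
# Each entry mirrors a row from the list below: (name, params, license).
models = [
    ("SAIL 7B", "7B", "GPL-3.0"),
    ("Guanaco", "65B", "MIT"),
    ("MPT-7B", "7B", "Apache 2.0"),
    ("StableLM", "7B", "CC BY-NC-SA-4.0"),
    ("Dolly", "12B", "Apache 2.0"),
]

# Permissive licenses generally allow commercial use; CC BY-NC-SA does not.
PERMISSIVE = {"Apache 2.0", "MIT", "BSD-3-Clause"}

def by_license(entries, allowed):
    """Return the model names whose license is in `allowed`."""
    return [name for name, _, lic in entries if lic in allowed]

print(by_license(models, PERMISSIVE))  # ['Guanaco', 'MPT-7B', 'Dolly']
```

(Always check the model card itself before relying on a license for commercial use; the "What do the Licenses mean?" section at the end of this article summarizes the main license families.)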

SAIL 7B

Description: Search-augmented instruction learning based on the LLaMA model.
Params: 7B
License: GPL-3.0
Release Date: May 25, 2023
Github: Source Code
Paper: SAIL — Search Augmented Instruction Learning

Guanaco

Description: LLM finetuned with the efficient QLoRA approach
Params: 65B
License: MIT
Release Date: May 24, 2023
Github: Source Code
Paper: QLoRA — Efficient Finetuning of Quantized LLMs

RWKV

Description: An RNN with transformer-level LLM performance
Params: 100M–14B
License: Apache 2.0
Release Date: May 15, 2023
Github: Source Code
Paper: Scaling RNN to 1.5B and Reach Transformer LM Performance

MPT-7B

Description: MosaicML’s Foundation series models
Params: 7B
License: Apache 2.0
Release Date: May 5, 2023
Github: Source Code
Paper: MPT-7B — A New Standard for Open-Source, Commercially Usable LLMs

OpenLLaMa

Description: Open-source reproduction of Meta AI’s LLaMA 7B trained on the RedPajama dataset.
Params: 3B, 7B
License: Apache 2.0
Release Date: May 5, 2023
Github: Source Code
Paper: Meet OpenLLaMA — An Open-Source Reproduction of Meta AI’s LLaMA Large Language Model

RedPajama-INCITE

Description: Instruction-tuned and chat models based on the Pythia model, trained on the RedPajama dataset.
Params: 3B, 7B
License: Apache 2.0
Release Date: May 5, 2023
Github: Source Code
Paper: RedPajama-INCITE family of models including base, instruction-tuned & chat models

h2oGPT

Description: H2O’s fine-tuning framework and chatbot UI with document question-answering capabilities
Params: 12B, 30B
License: Apache 2.0
Release Date: May 3, 2023
Github: Source Code
Paper: Building the World’s Best Open-Source Large Language Model: H2O.ai’s Journey

FastChat-T5

Description: Chatbot trained by fine-tuning Flan-T5-XL on user-shared conversations collected from ShareGPT
Params: 3B
License: Apache 2.0
Release Date: Apr 28, 2023
Github: Source Code
Paper: FastChat-T5 — our compact and commercial-friendly chatbot!

GPT4All

Description: Ecosystem to train and deploy powerful and customized LLMs
Params: 7–13B
License: MIT
Release Date: Apr 24, 2023
Github: Source Code
Paper: GPT4All: An ecosystem of open-source on-edge large language models.

MiniGPT-4

Description: Visual LLM Model based on BLIP-2 and Vicuna LLM
Params: 13B
License: BSD-3-Clause
Release Date: Apr 20, 2023
Github: Source Code
Paper: MiniGPT-4 — Enhancing Vision-Language Understanding with Advanced Large Language Models

StableLM

Description: Stability AI’s LLM model series
Params: 7B
License: CC BY-NC-SA-4.0
Release Date: Apr 19, 2023
Github: Source Code
Paper: Stability AI Launches the First of its StableLM Suite of Language Models

BloomZ

Description: Cross-lingual Generalization through Multitask Finetuning
Params: 176B
License: Apache 2.0
Release Date: Apr 19, 2023
Github: Source Code
Paper: Cross-lingual Generalization through Multitask Finetuning

Dolly

Description: Instruction-tuned Pythia 12B LLM trained on the Databricks ML platform
Params: 12B
License: Apache 2.0
Release Date: Apr 12, 2023
Github: Source Code
Paper: Free Dolly — Introducing the World’s First Truly Open Instruction-Tuned LLM

Baize Chatbot

Description: Open-source chat model based on LLaMA
Params: 30B
License: GPL-3.0
Release Date: Apr 10, 2023
Github: Source Code
Paper: Baize — An Open-Source Chat Model with Parameter-Efficient Tuning on Self-Chat Data

ColossalChat

Description: A complete RLHF Pipeline released open source by ColossalAI
Params: N/A
License: Apache 2.0
Release Date: Apr 6, 2023
Github: Source Code
Paper: ColossalChat — An Open-Source Solution for Cloning ChatGPT With a Complete RLHF Pipeline

Lit-LLaMA

Description: Open-source implementation of LLaMA from Lightning AI
Params: 13B
License: Apache 2.0
Release Date: Apr 4, 2023
Github: Source Code
Paper: Why We’re Building Lit-LLaMA

Cerebras-GPT

Description: Family of Open, Compute-efficient, Large Language Models
Params: 111M-13B
License: Apache 2.0
Release Date: Mar 28, 2023
Github: Source Code
Paper: Cerebras-GPT — Open Compute-Optimal Language Models Trained on the Cerebras Wafer-Scale Cluster

OpenFlamingo

Description: Open-source implementation of DeepMind’s Flamingo model
Params: 9B
License: MIT
Release Date: Mar 28, 2023
Github: https://github.com/mlfoundations/open_flamingo
Paper: OpenFlamingo — An Open-Source Framework for Training Vision-Language Models with In-Context Learning

ChatGLM

Description: Open bilingual (English & Chinese) bidirectional dense pre-trained model
Params: 6B-130B
License: Apache 2.0
Release Date: Mar 23, 2023
Github: Source Code
Paper: GLM-130B: An Open Bilingual Pre-trained Model

DLite

Description: Instruction-following model from AI Squared, created by fine-tuning the smallest GPT-2 model on the Alpaca dataset
Params: 124M
License: Apache 2.0
Release Date: Mar 16, 2023
Github: Source Code
Paper: Introducing DLite, a Lightweight ChatGPT-Like Model Based on Dolly

Alpaca 7B

Description: Stanford’s Instruction-following LLaMA Model
Params: 7B
License: Apache 2.0
Release Date: Mar 13, 2023
Github: Source Code
Paper: Alpaca — A Strong, Replicable Instruction-Following Model

Flan UL2

Description: A 20B Flan model trained on top of the pre-trained UL2 checkpoint.
Params: 20B
License: MIT
Release Date: Mar 3, 2023
Github: Source Code
Paper: A New Open Source Flan 20B with UL2

Flan-T5

Description: Instruction-finetuned T5, trained on a variety of datasets to improve the usability of pre-trained language models
Params: 60M–11B
License: Apache 2.0
Release Date: Feb 1, 2023
Github: Source Code
Paper: Scaling Instruction-Finetuned Language Models

Open Assistant

Description: Project meant to give everyone access to a great chat-based large language model.
Params: N/A
License: Apache 2.0
Release Date:
Dec 11, 2022
Github: Source Code
Paper: Open Assistant — Assistant for the Future

Galactica

Description: General-purpose scientific language model trained on scientific texts
Params: 125M–120B
License: Apache 2.0
Release Date: Nov 16, 2022
Github: Source Code
Paper: Galactica — A Large Language Model for Science

Bloom

Description: Largest multilingual open-access LM from BigScience
Params: 176B
License: OpenRAIL-M v1
Release Date: Nov 9, 2022
Github: Source Code
Paper: BLOOM — A 176B-Parameter Open-Access Multilingual Language Model

UL2

Description: An Open Source Unified Language Learner from Google research
Params: 20B
License: MIT License
Release Date: Nov 3, 2022
Github: Source Code
Paper: UL2 — Unifying Language Learning Paradigms

Tk-Instruct

Description: LLM from AllenAI that is tuned to solve many NLP tasks by following instructions.
Params: 3B, 7B
License: MIT
Release Date: Oct 24, 2022
Github: Source Code
Paper: Super-NaturalInstructions: Generalization via Declarative Instructions on 1600+ NLP Tasks

YaLM

Description: Pretrained LLM from Yandex for generating and processing text
Params: 100B
License: Apache 2.0
Release Date: Jun 19, 2022
Github: Source Code
Paper: Yandex Open-Sources YaLM Model With 100 Billion Parameters

OPT

Description: Series of open-sourced causal LLMs released by Meta AI that perform similarly to GPT-3
Params: 125M–175B
License: MIT
Release Date: May 2, 2022
Github: Source Code
Paper: OPT — Open Pre-trained Transformer Language Models

GPT-NeoX

Description: EleutherAI’s open-source version of GPT with fewer params
Params: 20B
License: Apache 2.0
Release Date: Apr 14, 2022
Github: Source Code
Paper: GPT-NeoX-20B — An Open-Source Autoregressive Language Model

GPT-J

Description: EleutherAI’s open-source version of GPT with fewer params
Params: 6B
License: Apache 2.0
Release Date: Jun 4, 2021
Github: Source Code
Paper: GPT-J-6B: 6B JAX-Based Transformer

Switch

Description: A trillion-parameter AI language model developed by Google
Params: 1.6T
License: MIT License
Release Date: Feb 16, 2021
Github: Source Code
Paper: Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficient Sparsity

Older Models

XLNet

Description: A generalized autoregressive pre-training model that maximizes expected likelihood over all permutations of the factorization order.
Params: 340M
License: Apache 2.0
Release Date: Jun 19, 2019
Github: Source Code
Paper: XLNet: Generalized Autoregressive Pretraining for Language Understanding

GPT-2

Description: Second iteration of a language model using the Transformer architecture from OpenAI
Params: 1.5B
License: MIT License
Release Date: Feb 4, 2019
Github: Source Code
Paper: Language Models are Unsupervised Multitask Learners

BERT

Description: Language representation model with a Transformer base and Masked Language Modelling (MLM) as the pre-training objective.
Params: 340M
License: Apache 2.0
Release Date: Oct 11, 2018
Github: Source Code
Paper: BERT — Pre-training of Deep Bidirectional Transformers for Language Understanding

GPT-1

Description: First iteration of a language model using the Transformer architecture from OpenAI
Params: 117M
License: MIT License
Release Date: Jun 11, 2018
Github: Source Code
Paper: Improving Language Understanding by Generative Pre-Training

To learn more about LLMs, subscribe to my channel.

What do the Licenses mean?

- Apache 2.0: The Apache 2.0 license is a permissive open-source license that allows free use, modification, and distribution of the model’s source code. Users are also allowed to sublicense the model under different terms.

- MIT License: The MIT License is another permissive open-source license that grants users the freedom to use, modify, and distribute the model’s source code, requiring only that the copyright and license notice be preserved. It is widely used in the open-source community due to its simplicity and flexibility.

- GPL-3.0 License: The GNU General Public License 3.0 is a copyleft license that requires any derivative work or modifications of the model to be distributed under the same license terms. It emphasizes the principles of open-source software and ensures that the code remains freely available to the public.

- BSD-3-Clause License: The 3-Clause BSD License is a permissive license that allows users to use, modify, and distribute the model’s source code, with the added condition that proper attribution must be given to the original authors.

- CC BY-NC-SA-4.0 License: The Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License allows users to use, modify, and distribute the model’s source code for non-commercial purposes, as long as they provide appropriate attribution and use the same license when distributing their derivative work.

In conclusion, the landscape of open-source Large Language Models is rapidly evolving, with numerous models being released regularly by the open-source community. These models offer an exciting opportunity for developers, researchers, and enthusiasts to experiment with cutting-edge language technologies without the constraints of proprietary systems. As more organizations and individuals contribute to the development of these models, we can expect to see even more powerful, accessible, and innovative language models that will shape the future of Natural Language Processing.

Written by Manikanth
Written by Manikanth

Data scientist | Helping businesses leverage their data using machine learning to drive results. https://linktr.ee/manikanthp