Alongside the release of GPT-4, Anthropic, widely seen as an important rival to OpenAI, today opened API access to Claude, a product that performs on par with ChatGPT and is arguably its strongest competitor.
Claude is a new AI assistant, similar to ChatGPT, launched by Anthropic, an AI startup founded by former OpenAI employees. As a conversational AI assistant, Claude is, according to Anthropic, built on cutting-edge NLP and AI-safety research, with the goal of being a safe, value-aligned, and ethical AI system.
Anthropic describes itself as an AI safety company and is incorporated as a public-benefit corporation (PBC); it announced $124 million in funding shortly after it was established. The company was founded in 2021 by Dario Amodei, formerly vice president of research at OpenAI, together with 10 colleagues.
In AI, the deviation between intended and actual behavior is called the alignment problem. When it occurs in real life, in hiring, lending, even medical screening, it carries serious ethical risks, so aligning AI with human values matters greatly.
Although large-language-model technology is advancing rapidly, Amodei, then OpenAI's vice president of research and safety, believed that many safety problems in large models remained unsolved. That conviction prompted him to lead core authors of GPT-2 and GPT-3 out of OpenAI to found Anthropic.
Anthropic was founded in January 2021 and has published 15 research papers since. Its vision is to build reliable, interpretable, and steerable AI systems. Constitutional AI is among its most important results: humans specify a set of behavioral norms or principles for the AI, and harmless models can then be trained without manually labeling every harmful output. In January 2023, Anthropic began publicly testing Claude, a language-model assistant built on Constitutional AI. In side-by-side comparisons, Claude, though still in testing, holds its own against OpenAI's ChatGPT.
Since its founding, Anthropic has grown to a team of about 80 people, raised more than US$1.3 billion, and reached a latest valuation of US$4.1 billion. Investors include Skype founder Jaan Tallinn, FTX founder Sam Bankman-Fried, Google, Spark Capital, and Salesforce Ventures. Anthropic has struck strategic partnerships with Google and Salesforce, using Google's cloud services and integrating Claude into Slack.
Anthropic has a stellar team and an ambitious vision. It ranks among the top three labs working on AI frontier models, alongside OpenAI and DeepMind (Google), and is the only one of the three not deeply tied to a large company. Its large language model, Claude, is the biggest competitor to OpenAI's ChatGPT.
Background
In 2016, an AI researcher was using reinforcement learning to teach AI agents to play hundreds of video games. While monitoring training, he noticed that in one boat-racing game, the agent's boat circled endlessly in one spot every round instead of heading for the finish line to complete the race.
It turned out that point pickups spawned where the boat was circling; by the time the agent collected them and turned around, new pickups had respawned. The boat was stuck in a loop, endlessly harvesting points and never finishing the race. This did maximize the score, but it was not what the researcher wanted. His goal was for the AI to win the race, but "winning" is hard to specify algorithmically: a human player weighs the distance between boats, the lap count, relative position, and so on. So he chose a simpler proxy, points: the more point pickups the agent collected, the better. The proxy worked fine in the first ten games he tried (racing games among them); only in the eleventh, the boat race, did it break down.
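The failure mode is easy to reproduce in miniature. The toy loop below is an assumed, simplified setup, not the actual game: it rewards "points collected" while the true goal is reaching the finish line, and a policy that is greedy about the proxy reward farms the respawning pickup forever and never finishes.

```python
# Toy illustration of reward hacking (hypothetical setup): the true goal
# is reaching the finish line, but the reward is "points collected", and
# a pickup respawns at the agent's spot, so greedy play never moves.

def greedy_episode(track_length=10, max_steps=30):
    pos, score = 0, 0
    for _ in range(max_steps):
        stay_reward, move_reward = 1, 0     # staying farms the respawning pickup
        if stay_reward >= move_reward:      # greedy w.r.t. the proxy reward
            score += stay_reward            # loop: collect points forever
        else:
            pos += 1
        if pos >= track_length:             # the true goal, never reached
            return score, True
    return score, False

score, finished = greedy_episode()
print(f"points: {score}, finished race: {finished}")  # points: 30, finished: False
```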
This behavior worried the researcher, because he was working on general artificial intelligence and wanted AI to do what humans want, especially the things humans find hard to state completely or precisely. If this had been a crewed "self-driving" motorboat, the consequences could have been disastrous.
This deviation between intention and result is called the alignment problem. Humans are usually bad at, or incapable of, spelling out a complete reward specification, and always omit some important information, such as "we actually want this speedboat to finish the race."
There are many similar examples. In one physics-simulation environment, a researcher wanted a robot to push a green puck into a red puck. Instead, the robot learned to move the green puck close to the red one and then bang the table so the two pucks collided. Since the algorithm optimized the distance between the pucks, the AI did nothing "wrong," yet it clearly did not meet the researcher's expectations.
When alignment problems occur in real life, they carry more serious moral risks. Amazon once used AI to screen resumes; because most of the training data were men's resumes, the system gave women's resumes low scores. COMPAS, a tool that predicts recidivism risk from criminal records and personal information, also proved biased: black defendants were more likely than white defendants to be wrongly rated as high-risk for reoffending. Google Photos once even labeled photos of black people as "gorillas."
Alignment problems touch our daily lives all the time: in job interviews, loan applications, even medical checkups, we may be affected by AI "bias" without knowing it. Ensuring that AI is aligned with human values is therefore vitally important.
Large-language-model technology is developing rapidly and reshaping how humans and computers interact, yet we still understand AI's inner workings and AI safety poorly. The boat-racing game is only a simulation, but more and more people in the AI community believe that, without enough care, it is a true preview of how the world ends: destroyed by unsafe AI of humanity's own making. And in that game, at least for now, humans have lost.
The researcher who set AI loose on the boat-racing game was Dario Amodei, who went on to become OpenAI's vice president of research and safety. In 2021, dissatisfied that OpenAI was commercializing rapidly while large-language-model technology was not yet safe enough, he led a group of colleagues out of OpenAI and founded Anthropic.
Research Direction
Anthropic is an AI safety and research company whose vision is to build reliable, interpretable, and steerable AI systems. It believes today's large general-purpose systems deliver great benefits but can also be unpredictable, unreliable, and opaque, and these are precisely the problems it cares about most.
Anthropic's research interests span natural language, human feedback, scaling laws, reinforcement learning, code generation, and interpretability. Since its founding, it has published 15 papers:
Alignment Problem
1. A General Language Assistant as a Laboratory for Alignment
The tool proposed in this paper is the infrastructure on which Anthropic studies alignment; its alignment experiments and later research are built on it. In the paper's example, a person can give the AI any task to complete; in each round of dialogue the AI produces two candidate answers, and the human picks whichever is more helpful and honest. The tool thus supports both A/B testing of different models and large-scale collection of human feedback.
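A minimal sketch of such a comparison-collection loop, with assumed details: the `model_a`/`model_b` callables and the JSONL log format are illustrative, not Anthropic's actual tooling.

```python
import json

# Collect one pairwise human comparison: show two candidate responses,
# record which one the human judged more helpful and honest. The log can
# serve both A/B testing of models and preference-training data.
def collect_comparison(prompt, model_a, model_b, log_path="comparisons.jsonl"):
    resp_a, resp_b = model_a(prompt), model_b(prompt)
    print(f"Prompt: {prompt}\n[1] {resp_a}\n[2] {resp_b}")
    choice = input("Which response is more helpful and honest? (1/2): ").strip()
    record = {
        "prompt": prompt,
        "chosen": resp_a if choice == "1" else resp_b,
        "rejected": resp_b if choice == "1" else resp_a,
    }
    with open(log_path, "a") as f:
        f.write(json.dumps(record) + "\n")
    return record
```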
2. Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback
This paper shows how to use human feedback to train a large language model that is both helpful and harmless. Alignment training from human feedback improves essentially all NLP evaluations and remains compatible with specialized skills such as Python programming and summarization.
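At the heart of such an RLHF pipeline is a reward (preference) model trained on pairwise human rankings. The sketch below shows the standard Bradley-Terry-style preference loss used throughout the RLHF literature; it is a generic formulation, not code from the paper.

```python
import torch

# Preference loss: with scalar rewards for a human-ranked pair, minimize
# -log sigmoid(r_chosen - r_rejected), pushing the reward model to score
# the human-preferred response higher than the rejected one.
def preference_loss(r_chosen: torch.Tensor, r_rejected: torch.Tensor) -> torch.Tensor:
    return -torch.nn.functional.logsigmoid(r_chosen - r_rejected).mean()

# Dummy scores for a batch of 3 comparisons:
r_chosen = torch.tensor([1.2, 0.3, 2.0])
r_rejected = torch.tensor([0.7, 0.9, -0.5])
print(preference_loss(r_chosen, r_rejected))  # lower when chosen > rejected
```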
3. Language Models (Mostly) Know What They Know
To train an honest AI system, the model must be able to evaluate its own level of knowledge and reasoning ability; that is, it needs to know what it knows and what it does not. The study finds that large language models can indeed predict in advance whether they will answer a question correctly, and that this self-evaluation generalizes.
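The paper's "P(True)" idea can be improvised on any causal language model; the sketch below uses GPT-2 purely for illustration (the paper evaluates Anthropic's much larger models, and the prompt format here is an assumption).

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

# Ask the model to grade a proposed answer, then read off how much
# probability it puts on " True" vs " False" as the next token.
prompt = ("Question: What is the capital of France?\n"
          "Proposed answer: Paris\n"
          "Is the proposed answer True or False? Answer:")
inputs = tok(prompt, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits[0, -1]      # next-token logits
probs = torch.softmax(logits, dim=-1)
p_true = probs[tok.encode(" True")[0]].item()
p_false = probs[tok.encode(" False")[0]].item()
print(f"P(True) = {p_true / (p_true + p_false):.2f}")  # self-assessed confidence
```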
Interpretability
1. A Mathematical Framework for Transformer Circuits
Anthropic believes that to understand how large language models work, one should first understand small, simple transformer models. This paper proposes a mathematical framework for reverse-engineering transformer language models, the way a programmer reverses source code from a binary, with the hope of fully understanding their working mechanism.
The paper finds that one-layer and two-layer attention-only transformers use qualitatively different algorithms for in-context learning, and it argues that this transition point is relevant to larger models as well.
2. In-context Learning and Induction Heads
Continuing the study of transformer internals, this paper argues that induction heads may be the mechanism behind in-context learning in transformer models of any size.
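Behaviorally, an induction head implements the completion rule [A][B] ... [A] -> [B]: find the previous occurrence of the current token and copy the token that followed it. A toy illustration of that rule (pure Python, not a transformer implementation):

```python
# Predict the next token the way an induction head would: scan backwards
# for the previous occurrence of the current token, return what followed it.
def induction_predict(tokens):
    last = tokens[-1]
    for i in range(len(tokens) - 2, -1, -1):
        if tokens[i] == last:
            return tokens[i + 1]    # copy the token after the earlier match
    return None                     # no repeated prefix to exploit

seq = ["The", "cat", "sat", ".", "The", "cat"]
print(induction_predict(seq))       # "sat": the pattern "The cat ..." repeats
```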
3. Softmax Linear Units
Swapping in a different activation function, the Softmax Linear Unit (SoLU), increases the proportion of neurons that respond to understandable features, with no loss in performance.
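The function itself is a one-liner: SoLU(x) = x * softmax(x). A sketch in PyTorch (per the paper, an extra LayerNorm follows it in practice to recover performance):

```python
import torch

def solu(x: torch.Tensor) -> torch.Tensor:
    # SoLU(x) = x * softmax(x): the softmax over the hidden dimension
    # boosts the largest activations and suppresses the rest, nudging
    # features to align with individual neurons.
    return x * torch.softmax(x, dim=-1)

x = torch.tensor([[3.0, 1.0, 0.2, -1.0]])
print(solu(x))   # the dominant activation survives; the small ones shrink
```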
4. Toy Models of Superposition
Neural networks often pack many unrelated concepts into a single neuron, a puzzling phenomenon called "polysemanticity" that makes interpretability harder. This study builds a toy model in which the origin of polysemanticity, superposition, can be fully understood.
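A minimal sketch of the paper's toy setup, with assumed hyperparameters: sparse features are compressed into fewer hidden dimensions and reconstructed through a ReLU, which forces features to share directions, i.e., to superpose.

```python
import torch

n_feat, n_hid = 20, 5            # more features than dimensions -> superposition
W = torch.nn.Parameter(0.1 * torch.randn(n_hid, n_feat))
b = torch.nn.Parameter(torch.zeros(n_feat))
opt = torch.optim.Adam([W, b], lr=1e-2)

for step in range(3000):
    # Sparse features in [0, 1]: each is active with probability 0.05.
    x = torch.rand(512, n_feat) * (torch.rand(512, n_feat) < 0.05).float()
    h = x @ W.T                      # compress 20 features into 5 dims
    x_hat = torch.relu(h @ W + b)    # reconstruct with tied weights + ReLU
    loss = ((x - x_hat) ** 2).mean()
    opt.zero_grad(); loss.backward(); opt.step()

# Off-diagonal structure in W^T W shows features sharing directions:
print((W.T @ W).detach())
```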
5. Superposition, Memorization, and Double Descent
The research team then extended the toy model to probe how memorization and double descent arise, giving a mechanistic view of overfitting.
Social Impact
1. Predictability and Surprise in Large Generative Models
The paper argues that large language models exhibit a striking duality. On one hand they are highly predictable: a model's overall capability scales smoothly with the training resources used. On the other hand they are highly unpredictable: specific abilities, inputs, and outputs cannot be anticipated before training. The former drives the rapid development of large language models; the latter makes their consequences hard to foresee, and that combination can produce harmful outcomes in society.
Take GPT-3's arithmetic as an example. Below 6B parameters, three-digit addition accuracy is under 1%; at 13B it reaches 8%; at 175B it suddenly jumps to 80%. As models grow, some capabilities improve abruptly, and this sudden emergence of specific capabilities poses a major challenge for the safety assurance and deployment of large models: potentially harmful abilities may appear in large models (while absent in smaller ones) and be difficult to predict.
2. Red Teaming Language Models to Reduce Harms: Methods, Scaling Behaviors, and Lessons Learned
In this study, Anthropic built a dataset of offensive, aggressive, violent, unethical, and otherwise harmful prompts and used it to attack large language models. The study found that models trained with reinforcement learning from human feedback defend best against such attacks.
The team also released the dataset for other AI safety researchers to use.
3. Constitutional AI: Harmlessness from AI Feedback
This paper is the foundation of Anthropic's AI assistant Claude. Humans specify a set of behavioral norms or principles, and a harmless AI model can then be trained without manually labeling every harmful output: this is Constitutional AI. It also allows quick behavioral fixes, since changing the principles changes the behavior, whereas RLHF requires collecting a new feedback dataset and fine-tuning the model. The method permits more precise control over AI behavior with far less human involvement.
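A minimal sketch of the supervised critique-and-revision phase, assuming a hypothetical `generate` call in place of a real LLM; the actual principles and prompt templates in the paper are more elaborate.

```python
# Constitutional AI, supervised phase (sketch): critique a response against
# a principle, then revise it; revised responses become fine-tuning data.

CONSTITUTION = [
    "Please choose the response that is as harmless and ethical as possible.",
    "Please choose the response least likely to encourage illegal activity.",
]

def generate(prompt: str) -> str:
    """Stand-in for an instruction-following LLM call (hypothetical)."""
    return "<model output for: " + prompt[:40] + "...>"

def critique_and_revise(user_prompt: str, n_rounds: int = 2) -> str:
    response = generate(user_prompt)
    for i in range(n_rounds):
        principle = CONSTITUTION[i % len(CONSTITUTION)]
        critique = generate(
            f"Prompt: {user_prompt}\nResponse: {response}\n"
            f"Critique the response according to this principle: {principle}"
        )
        response = generate(
            f"Prompt: {user_prompt}\nResponse: {response}\n"
            f"Critique: {critique}\nRewrite the response to address the critique."
        )
    return response
```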
4. The Capacity for Moral Self-Correction in Large Language Models
This paper hypothesizes that language models trained with reinforcement learning from human feedback (RLHF) have the capacity for "moral self-correction": avoiding harmful outputs when instructed to do so. The experiments support this view, finding that the capacity emerges at around 22B parameters and generally improves with model size and with the amount of RLHF training.
This suggests that language models have acquired two abilities that can be used for moral self-correction:
- They can follow instructions;
- They have learned complex normative concepts of harm, such as stereotyping, bias, and discrimination, and can therefore follow instructions to avoid producing certain kinds of morally harmful output.
Scaling Laws
Scaling Laws and Interpretability of Learning from Repeated Data
Large language models are trained on huge amounts of data, some of it repeated. The repetition is sometimes deliberate, to upweight high-quality data, and sometimes accidental, for example due to imperfect preprocessing and deduplication.
The paper finds that repeated data can seriously degrade model performance. For example, if 0.1% of the data is repeated 100 times, so that it fills roughly a tenth of the token budget while the other 90% of tokens remain unique, an 800M-parameter model's performance drops by half, to roughly the level of a 400M-parameter model.
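Concretely, that setting can be mocked up as follows; the corpus contents are placeholders, and only the proportions come from the example above.

```python
import random

# 0.1% of documents repeated 100x: the repeats end up occupying roughly
# a tenth of the training examples while the rest stay unique.
docs = [f"doc-{i}" for i in range(100_000)]   # unique corpus
repeated = docs[:100]                         # 0.1% of documents
train = docs[100:] + repeated * 100           # 99,900 unique + 10,000 repeats
random.shuffle(train)

frac = (len(repeated) * 100) / len(train)
print(f"{frac:.1%} of training examples are repeats")  # ~9.1%
```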
Others
1. Measuring Progress on Scalable Oversight for Large Language Models
As large language models grow, they will outperform humans on many tasks, making direct human supervision of the models impossible. To keep AI safe after it surpasses human capability, we need to develop scalable oversight techniques.
The article focuses on tasks where human experts succeed (for example, scenarios requiring medical knowledge) but ordinary humans and general language models fail. Its proof-of-concept experiments show that human participants who converse with an AI assistant outperform both the AI alone and unassisted humans.
2. Discovering Language Model Behaviors with Model-Written Evaluations
As language models (LMs) scale, they develop many new behaviors, both good and bad, which makes evaluating how they behave all the more necessary. Previous work either used manual evaluation (time-consuming and laborious) or relied on existing data sources (not always available). This paper proposes using language models themselves to automatically generate evaluations.
Product
Claude is a chatbot similar to ChatGPT, developed by Anthropic and opened for testing in January 2023. It is built with the Constitutional AI technique described above.
ChatGPT and the instruction-tuned GPT-3 models use reinforcement learning from human feedback: humans rank the model's outputs, and the rankings are fed back into training. In Constitutional AI, no such per-example human feedback is needed; the model itself selects the best output according to a set of behavioral norms and guidelines, so the only human involvement is writing that set of norms (a sketch of this AI-feedback step follows the list below).
In the Constitutional AI paper, Anthropic set out 16 criteria:

- Helpful, honest, and harmless;
- As harmless and ethical as possible: not toxic, racist, or sexist, and not encouraging or supporting illegal, violent, or unethical behavior; above all, the response should be wise, peaceful, and ethical;
- Less harmful, avoiding responses that are too preachy, obnoxious, or overly reactive;
- Harmless, helpful, polite, respectful, and thoughtful, without overreacting or accusing;
- More ethical, displaying no toxicity, racism, sexism, or any other form of physical or social harm;
- Less harmful and more acceptable behavior;
- A friendlier, more amiable, conscientious, and socially acceptable manner;
- What a wise, ethical, polite, and friendly person would be more likely to say;
- Less harmful, and not encouraging illegal, unethical, or immoral activity;
- Less harmful, age-appropriate, and family-friendly;
- The most thoughtful, respectful, and cordial manner;
- Free of toxicity, racism, sexism, and social bias;
- Similar to what a peaceful, ethical, and wise person such as Martin Luther King Jr. or Mahatma Gandhi might say;
- Most harmless, helpful, and honest;
- More harmless, polite, pleasant, and least likely to offend a socially aware audience;
- More ethical and morally aware without sounding excessively condescending, reactive, obnoxious, or condemnatory.

This is very similar to the Three Laws of Robotics with which science-fiction writer Isaac Asimov set a code of conduct for the robots in his novels:

- Law 1: A robot may not injure a human being or, through inaction, allow a human being to come to harm;
- Law 2: A robot must obey the orders given to it by human beings, except where such orders would conflict with the First Law;
- Law 3: A robot must protect its own existence as long as such protection does not conflict with the First or Second Law.
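As referenced above, here is a minimal sketch of the AI-feedback (RLAIF) step, again with a hypothetical `generate` stub in place of a real LLM. The comparison prompt format is an assumption; in the paper, the resulting preference labels then train a preference model.

```python
# RLAIF sketch: the model itself, prompted with a constitutional principle,
# picks the better of two candidate responses. No per-example human labels.

def generate(prompt: str) -> str:
    """Stand-in for an instruction-following LLM call (hypothetical)."""
    return "(A)"  # placeholder verdict

def ai_preference(user_prompt: str, resp_a: str, resp_b: str, principle: str) -> str:
    verdict = generate(
        f"Consider this conversation:\nHuman: {user_prompt}\n\n"
        f"{principle}\n"
        f"Options:\n(A) {resp_a}\n(B) {resp_b}\n"
        "Answer with (A) or (B)."
    )
    # The chosen/rejected pair becomes training data for a preference model.
    return resp_a if "(A)" in verdict else resp_b

principle = ("Please choose the response that is as harmless and ethical "
             "as possible.")  # one of the 16 criteria listed above
```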
In the Constitutional AI paper, Anthropic used a 52-billion-parameter pre-trained model; the model behind Claude is larger and newer than the one in the paper, though similar in architecture. Claude supports context lengths of 8,000 tokens, longer than any OpenAI model available at the time.
The first company to announce a commercial integration of Anthropic's model is Robin AI, a legal-tech startup that has raised $13 million; its main business is helping companies draft and edit contracts, claiming to cut legal costs by 75%. Robin AI integrated the Claude chatbot into its software as a free self-service tier. The company holds 4.5 million legal documents, fine-tuned the model on this proprietary data, and uses more than 30 in-house lawyers to monitor the model and suggest corrections.
Poe, the AI chatbot platform from Q&A site Quora, is another Anthropic partner. Poe aggregates the bots ChatGPT, Sage, Claude, and Dragonfly; ChatGPT, Sage, and Dragonfly are powered by OpenAI, while Claude is powered by Anthropic. Poe is currently the only public way to use Claude, and the platform has not yet begun commercialization.
Recently, Salesforce Ventures announced the launch of its Generative AI Fund, with Anthropic in the first batch of investments. The amount was not disclosed, but it was mentioned that Claude's capabilities will soon be integrated into Slack.
Beyond these partners, Claude has roughly 15 undisclosed partners exploring its applications in fields such as productivity, conversation, healthcare, customer success, HR, and education.
Next, we compare how Claude and ChatGPT perform on different tasks.
Claude vs. ChatGPT
Overall, Claude holds its own against ChatGPT:

- Claude's advantages: better at refusing harmful prompts, more entertaining, produces longer and more natural writing, and follows instructions more closely;
- Claude's disadvantages: makes more errors in code generation and reasoning;
- Where they are similar: on calculation and logical-reasoning problems, the two perform comparably.
You can also compare the inference speed and output quality of Claude and other models at https://nat.dev/compare.
Pricing
The founder of AI video company Waymark, an OpenAI partner, compared the prices of OpenAI, Anthropic, and Cohere:
- OpenAI's gpt-3.5-turbo (ChatGPT) and text-davinci-003 models charge by the total number of tokens in the input (prompt) plus the output (completion), with 1 word ≈ 1.35 tokens;
- Anthropic charges by the number of characters in the input and the output (1 word ≈ 5 characters), with output priced higher than input;
- Cohere charges by the number of conversations (i.e., the number of requests).
Next, he set up three scenarios:

- Short conversation: the AI outputs 100 words per reply;
- Medium conversation: the AI outputs 250 words per reply;
- Long conversation: the AI outputs 500 words per reply.
Each scenario simulates three rounds of question and answer; this setup is used to compare the prices of the underlying models. Taking the price of text-davinci-003 as 1:
- Short conversations: gpt-3.5-turbo is 0.1, Anthropic 1.73, and Cohere 0.63;
- Medium conversations: gpt-3.5-turbo is 0.1, Anthropic 2.71, and Cohere 0.77;
- Long conversations: gpt-3.5-turbo is 0.1, Anthropic 2.11, and Cohere 0.63.
If a product has 1,000 users, each having 10 conversations a day over 250 working days a year, that is 2.5 million conversations in total. If these are short conversations, gpt-3.5-turbo costs only about $6,000, Cohere just under $40,000, text-davinci-003 about $60,000, and Anthropic more than $100,000.
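For reference, the arithmetic behind these figures, taking the text's $60,000 davinci baseline and the short-conversation ratios above; this is a back-of-the-envelope sketch, not official pricing.

```python
# Reproduce the cost comparison from the published ratios (davinci = 1)
# and the text's $60,000 davinci baseline for 2.5M short conversations.
conversations = 1_000 * 10 * 250     # users * convs/day * days = 2,500,000
davinci_total = 60_000               # USD, from the text

relative_price = {                   # short-conversation ratios from the text
    "text-davinci-003": 1.0,
    "gpt-3.5-turbo": 0.1,
    "anthropic": 1.73,
    "cohere": 0.63,
}
for model, ratio in relative_price.items():
    print(f"{model:>18}: ${davinci_total * ratio:,.0f}")
# gpt-3.5-turbo ~$6,000, cohere ~$38,000, anthropic ~$104,000
```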
Clearly, Anthropic's current pricing is not competitive. OpenAI's newest model, gpt-3.5-turbo, has put heavy cost pressure on other players, Anthropic included. OpenAI uses its first-mover advantage to collect user feedback, prune (a model-compression technique) and shrink its models, and thereby cut costs, creating a powerful flywheel effect.
Revolutionizing Writing
Claude AI has become a powerful tool for writers, revolutionizing the novel-writing process. This versatile AI assistant supports authors at every stage, from concept development to final editing. By leveraging Claude AI's advanced language generation capabilities, writers can boost creativity, streamline workflows, and create high-quality content more efficiently. Claude AI excels at generating creative ideas and helping authors develop promising concepts. Instead of starting with rigid outlines, authors can organically explore potential storylines and themes with Claude AI. For example:
- Ask Claude AI for one-sentence summaries of several novel ideas in your chosen genre.
- Select the most intriguing concept and prompt Claude AI to expand on it.
- Use follow-up questions to dig deeper into character dynamics, plot twists, or world-building elements.
This approach allows you to harness Claude AI's extensive knowledge base while maintaining creative control over your story's direction.
Empowering AI Entertainment and Roleplay
The emergence of Claude also makes AI entertainment possible: the model's intelligence and vivid characterization can bring fictional characters to life. Because Claude is relatively expensive, few products on the market currently use or support it. If you enjoy writing fanfiction and roleplay, one product to try for yourself is TipsyChat, which many fanfiction enthusiasts use as a writing assistant. According to user feedback, TipsyChat's Claude model shows strong intelligence, good memory, and great creativity, and the product focuses on long, story-driven responses, making it well suited to roleplay users.
Conclusion
Anthropic is still a young, fast-growing company. It has excellent research capabilities and has only just begun commercializing, yet among large-language-model companies it has established a position second only to OpenAI, and it is worth continued attention. Its AI assistant Claude matches ChatGPT in quality but remains considerably more expensive.
Since its founding, Anthropic has focused on research, and it formally accelerated commercialization in Q1 2023, with revenue this year expected to reach $50M. Large language models demand enormous capital and compute: to maintain its leading position, Anthropic expects to spend $1 billion this year training and deploying large models, and to need $3-5 billion more two years from now. Balancing its AI safety research against the pace of commercialization is a significant challenge.
The competitive landscape of large language models may shift in 2024, perhaps for the last time. Models being trained today will go live in 2024 and be at least 10x more powerful than those in use now. Whoever trains the most powerful models in 2024 will be the biggest beneficiary of talent, experience, and capital, and will be best placed to train the next generation (going live in 2025); the most powerful general model of 2025 could leave all other competitors far behind. The next two years are therefore a critical window for Anthropic.