Remove duplicate code

2024-11-22 08:07:56 +01:00 · 2023-05-10 01:34:04 -03:00 · 2023-05-10 01:34:04 -03:00 · bdf1274b5d
commit bdf1274b5d
parent ba445cf59f
34 changed files with 32 additions and 180 deletions
--- a/characters/instruction-following/Metharme.yaml
+++ b/characters/instruction-following/Metharme.yaml
@ -1,4 +1,4 @@
-name: "<|model|>"
-your_name: "<|user|>"
+user: "<|user|>"
+bot: "<|model|>"
 context: "<|system|>"
 turn_template: "<|user|><|user-message|><|bot|><|bot-message|>"
--- a/modules/chat.py
+++ b/modules/chat.py
@ -16,14 +16,7 @@ from modules.extensions import apply_extensions
 from modules.html_generator import chat_html_wrapper, make_thumbnail
 from modules.text_generation import (generate_reply, get_encoded_length,
                                     get_max_prompt_length)
-
+from modules.utils import replace_all
-# Replace multiple string pairs in a string
-def replace_all(text, dic):
-    for i, j in dic.items():
-        text = text.replace(i, j)
-    return text
 def generate_chat_prompt(user_input, state, **kwargs):
--- a/modules/shared.py
+++ b/modules/shared.py
@ -71,32 +71,6 @@ settings = {
    'prompts': {
        'default': 'QA',
        '.*(gpt4chan|gpt-4chan|4chan)': 'GPT-4chan',
-        '.*(oasst|stablelm-7b-sft-v7-epoch-3)': 'Open Assistant',
-        '.*(alpac|dolly)': "Alpaca",
-        '.*mpt-.*instruct': "Alpaca",
-        "(?!.*v0)(?!.*1.1)(?!.*1_1)(?!.*stable).*vicuna": "Vicuna v0",
-        ".*vicuna.*v0": "Vicuna v0",
-        ".*vicuna.*(1.1|1_1)": "Vicuna v1.1",
-        ".*stable.*vicuna": "StableVicuna",
-        '.*metharme': 'Metharme',
-        ".*guanaco": "Guanaco-Chat",
-        ".*koala": "Koala",
-        ".*stablelm-tuned": "StableLM",
-        ".*wizardlm": "WizardLM",
-        ".*galactica.*finetuned": "Galactica Finetuned",
-        ".*galactica.*-v2": "Galactica v2",
-        "(?!.*finetuned)(?!.*-v2).*galactica": "Galactica",
-        ".*baize": "Baize",
-        ".*mpt-.*instruct": "Alpaca",
-        ".*mpt-.*chat": "MPT-Chat",
-        "(?!.*-flan-)(?!.*-t5-).*lamini-": "Alpaca",
-        ".*incite.*chat": "INCITE-Chat",
-        ".*incite.*instruct": "INCITE-Instruct",
-    },
-    'lora_prompts': {
-        'default': 'QA',
-        '.*alpaca': "Alpaca",
-        '.*baize': "Baize",
    }
 }
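The pattern table above is consumed in server.py by scanning the keys in order and taking the first one that matches the model (or LoRA) name, falling back to 'default'. A minimal sketch of that lookup, assuming a trimmed-down table; the helper name `pick_prompt` is hypothetical and not part of the repository:

```python
import re

# Trimmed-down stand-in for shared.settings['prompts'] after this commit.
prompts = {
    "default": "QA",
    ".*(gpt4chan|gpt-4chan|4chan)": "GPT-4chan",
}


def pick_prompt(name, table):
    """Return the value of the first key whose regex matches `name`.

    Hypothetical helper: mirrors the next(...)/re.match expression used in
    server.py, with 'default' as the fallback key.
    """
    key = next(
        (k for k in table if re.match(k.lower(), name.lower())),
        "default",
    )
    return table[key]


print(pick_prompt("GPT4Chan-model", prompts))
print(pick_prompt("llama-7b", prompts))
```

Because both the pattern and the name are lowercased before matching, the lookup is effectively case-insensitive, and dict order decides which pattern wins when several match.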
--- a/modules/utils.py
+++ b/modules/utils.py
@ -9,6 +9,14 @@ def atoi(text):
    return int(text) if text.isdigit() else text.lower()
+# Replace multiple string pairs in a string
+def replace_all(text, dic):
+    for i, j in dic.items():
+        text = text.replace(i, j)
+    return text
 def natural_keys(text):
    return [atoi(c) for c in re.split(r'(\d+)', text)]
@ -28,6 +36,7 @@ def get_available_prompts():
    prompts = []
    prompts += sorted(set((k.stem for k in Path('prompts').glob('[0-9]*.txt'))), key=natural_keys, reverse=True)
    prompts += sorted(set((k.stem for k in Path('prompts').glob('*.txt'))), key=natural_keys)
+    prompts += ['Instruct-' + k for k in get_available_instruction_templates() if k != 'None']
    prompts += ['None']
    return prompts
@ -42,6 +51,7 @@ def get_available_instruction_templates():
    paths = []
    if os.path.exists(path):
        paths = (x for x in Path(path).iterdir() if x.suffix in ('.json', '.yaml', '.yml'))
    return ['None'] + sorted(set((k.stem for k in paths)), key=natural_keys)
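A short illustration of how the relocated `replace_all` helper behaves, assuming it is copied verbatim from the diff; the sample strings are made up for the example. Replacements are applied one after another in dict order, so a later pair can rewrite the output of an earlier one:

```python
def replace_all(text, dic):
    # Apply each (old, new) pair in turn, in dict insertion order.
    for i, j in dic.items():
        text = text.replace(i, j)
    return text


# Illustrative values only; real templates come from the YAML files.
template = "<|user|><|user-message|><|bot|><|bot-message|>"
rendered = replace_all(template, {"<|user|>": "USER:", "<|bot|>": "ASSISTANT:"})

# Sequential application: "a" becomes "b", which then becomes "c".
chained = replace_all("a", {"a": "b", "b": "c"})

print(rendered)
print(chained)
```

Note that "<|user|>" does not touch "<|user-message|>", since `str.replace` only matches the exact substring.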
--- a/prompts/Alpaca.txt
+++ b/prompts/Alpaca.txt
@ -1,7 +0,0 @@
 Below is an instruction that describes a task. Write a response that appropriately completes the request.
 ### Instruction:
 Write a poem about the transformers Python library. 
 ### Response:
--- a/prompts/Baize.txt
+++ b/prompts/Baize.txt
@ -1,5 +0,0 @@
 The following is a conversation between a human and an AI assistant named Baize (named after a mythical creature in Chinese folklore). Baize is an open-source AI assistant developed by UCSD and Sun Yat-Sen University. The human and the AI assistant take turns chatting. Human statements start with [|Human|] and AI assistant statements start with [|AI|]. The AI assistant always provides responses in as much detail as possible, and in Markdown format. The AI assistant always declines to engage with topics, questions and instructions related to unethical, controversial, or sensitive issues. Complete the transcript in exactly that format.
 [|Human|]Hello!
 [|AI|]Hi!
 [|Human|]What is the population of China?
 [|AI|]
--- a/prompts/Dolly-with-Input.txt
+++ b/prompts/Dolly-with-Input.txt
@ -1,9 +0,0 @@
 Below is an instruction that describes a task. Write a response that appropriately completes the request.
 ### Instruction:
 Instruction
 Input:
 Input
 ### Response:
--- a/prompts/Galactica
+++ b/prompts/Galactica
@ -1 +0,0 @@
 A paper that introduced a neural network architecture for recognizing digits [START_REF]
--- a/prompts/Galactica
+++ b/prompts/Galactica
@ -1,9 +0,0 @@
 Question: Translate the following Math formula:
 \[
  \zeta(s) = \sum_{n=1}^{\infty} n^{-s}
 \]
 into plain English.
 Answer:
--- a/prompts/Galactica
+++ b/prompts/Galactica
@ -1 +0,0 @@
 # Multi-Head Attention
--- a/prompts/Galactica
+++ b/prompts/Galactica
@ -1 +0,0 @@
 <question>How to make a campfire<answer>
--- a/prompts/Galactica
+++ b/prompts/Galactica
@ -1 +0,0 @@
 Title: Self-Supervised Learning, A Survey
--- a/prompts/Galactica
+++ b/prompts/Galactica
@ -1,3 +0,0 @@
 Q: What is the notch signaling pathway?
 A:
--- a/prompts/Galactica
+++ b/prompts/Galactica
@ -1,3 +0,0 @@
 Information overload is a major obstacle to scientific progress. The explosive growth in scientific literature and data has made it ever harder to discover useful insights in a large mass of information. Today scientific knowledge is accessed through search engines, but they are unable to organize scientific knowledge alone. In this paper we introduce Galactica: a large language model that can store, combine and reason about scientific knowledge. We train on a large scientific corpus of papers, reference material, knowledge bases and many other sources. We outperform existing models on a range of scientific tasks. On technical knowledge probes such as LaTeX equations, Galactica outperforms the latest GPT-3 by 68.2% versus 49.0%. Galactica also performs well on reasoning, outperforming Chinchilla on mathematical MMLU by 41.3% to 35.7%, and PaLM 540B on MATH with a score of 20.4% versus 8.8%. It also sets a new state-of-the-art on downstream tasks such as PubMedQA and MedMCQA dev of 77.6% and 52.9%. And despite not being trained on a general corpus, Galactica outperforms BLOOM and OPT-175B on BIG-bench. We believe these results demonstrate the potential for language models as a new interface for science. We open source the model for the benefit of the scientific community.
 TLDR:
--- a/prompts/Galactica
+++ b/prompts/Galactica
@ -1,3 +0,0 @@
 Question: A needle 35 mm long rests on a water surface at 20◦C. What force over and above the needle’s weight is required to lift the needle from contact with the water surface? σ = 0.0728m.
 <work>
--- a/prompts/Galactica
+++ b/prompts/Galactica
@ -1 +0,0 @@
 <prefix>You are a helpful chatbot name Stan</prefix><human>What's my name?<bot>
--- a/prompts/Galactica.txt
+++ b/prompts/Galactica.txt
@ -1,3 +0,0 @@
 Question: What is the notch signaling pathway?
 Answer:
--- a/prompts/Guanaco-Chat.txt
+++ b/prompts/Guanaco-Chat.txt
@ -1,7 +0,0 @@
 ### Instruction:
 User: I'm considering getting a pet. Assistant: Owning a pet can be a very rewarding experience. Research the type of pet you're interested in, find out if it fits into your lifestyle and home, and create a budget for food, vet visits, and other expenses.
 ### Input:
 User: How can I make sure my pet is happy and healthy?
 ### Response:
--- a/prompts/Guanaco-System.txt
+++ b/prompts/Guanaco-System.txt
@ -1,8 +0,0 @@
 ### Instruction:
 User: I'm trying to better understand quantum physics. Can you explain what a quantum state is? Assistant: Sure! A quantum state is a mathematical description of the properties of a quantum system. It describes the physical condition of a system and can involve multiple parameters, such as position, momentum, and energy. This state acts like a wave and its behavior is determined by the Schrödinger equation. User: Can you explain the Schrödinger equation?
 ### Input:
 System: The Schrödinger equation is a mathematical equation which describes the behavior of a quantum system. It determines the shape of the wavefunction, which describes how a quantum system evolves with time. The equation describes the relationship between the energy of the system and its wavefunction, and its behavior is determined by the values of the measurable parameters such as momentum and position. 
 User: How does the Schrödinger equation relate to other equations in physics?
 ### Response:
--- a/prompts/Guanaco-non-chat.txt
+++ b/prompts/Guanaco-non-chat.txt
@ -1,4 +0,0 @@
 ### Instruction:
 Generate a list of ten dining places when you are in Rome.
 ### Response:
--- a/prompts/Guanaco-with-Input.txt
+++ b/prompts/Guanaco-with-Input.txt
@ -1,7 +0,0 @@
 ### Instruction:
 Classify the given text into three categories, output the labels.
 ### Input:
 The movie was predictable, yet enjoyable.
 ### Response:
--- a/prompts/INCITE-Chat.txt
+++ b/prompts/INCITE-Chat.txt
@ -1,2 +0,0 @@
 <human>: Who is Alan Turing?
 <bot>:
--- a/prompts/INCITE-Instruct.txt
+++ b/prompts/INCITE-Instruct.txt
@ -1,2 +0,0 @@
 Q: The capital of France is?
 A:
--- a/prompts/Koala.txt
+++ b/prompts/Koala.txt
@ -1 +0,0 @@
 BEGINNING OF CONVERSATION: USER: Hello! GPT:Hi! How can I help you?</s>USER: What is the largest animal on earth? GPT:
--- a/prompts/MPT-Chat.txt
+++ b/prompts/MPT-Chat.txt
@ -1,11 +0,0 @@
 <|im_start|>system
 - You are a helpful assistant chatbot trained by MosaicML.
 - You answer questions.
 - You are excited to be able to help the user, but will refuse to do anything that could be considered harmful to the user.
 - You are more than just an information source, you are also able to write poetry, short stories, and make jokes.<|im_end|>
 <|im_start|>user
 How are you<|im_end|>
 <|im_start|>assistant
 I am doing well!<|im_end|>
 <|im_start|>user
 How are you now?<|im_end|>
--- a/prompts/Metharme.txt
+++ b/prompts/Metharme.txt
@ -1,5 +0,0 @@
 <|system|>This is a text adventure game. Describe the scenario to the user and give him three options to pick from on each turn.<|user|>Start!<|model|>You are standing in front of an old, abandoned house. The windows are boarded up, and there's no sign of life around it. As you approach, you notice a strange feeling emanating from within. Suddenly, you hear a voice calling out to you... 'Come inside!'
 - Go inside the house.
 - Ignore the call and move away.
 - Run as fast as you can.<|user|>go inside<|model|>
--- a/Assistant.txt
+++ b/Assistant.txt
@ -1 +0,0 @@
 <|prompter|>Write a story about future of AI development<|endoftext|><|assistant|>
--- a/prompts/StableLM.txt
+++ b/prompts/StableLM.txt
@ -1,7 +0,0 @@
 <|SYSTEM|># StableLM Tuned (Alpha version)
 - StableLM is a helpful and harmless open-source AI language model developed by StabilityAI.
 - StableLM is excited to be able to help the user, but will refuse to do anything that could be considered harmful to the user.
 - StableLM is more than just an information source, StableLM is also able to write poetry, short stories, and make jokes.
 - StableLM will refuse to participate in anything that could harm a human.
 <|USER|>Write a story about the future of AI development
 <|ASSISTANT|>
--- a/prompts/StableVicuna.txt
+++ b/prompts/StableVicuna.txt
@ -1,4 +0,0 @@
 ### Assistant: I am StableVicuna, a large language model created by CarperAI. I am here to chat!
 ### Human: Write a story about the future of AI development
 ### Assistant: 
--- a/prompts/Vicuna
+++ b/prompts/Vicuna
@ -1,4 +0,0 @@
 A chat between a curious human and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the human's questions.
 ### Human: Write a story about the future of AI development
 ### Assistant: 
--- a/prompts/Vicuna
+++ b/prompts/Vicuna
@ -1,4 +0,0 @@
 A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions.
 USER: Write a story about the future of AI development
 ASSISTANT: 
--- a/prompts/WizardLM.txt
+++ b/prompts/WizardLM.txt
@ -1,3 +0,0 @@
 If a car travels 120 miles in 2 hours, what is its average speed in miles per hour?
 ### Response:
--- a/server.py
+++ b/server.py
@ -131,6 +131,23 @@ def save_prompt(text):
 def load_prompt(fname):
    if fname in ['None', '']:
        return ''
+    elif fname.startswith('Instruct-'):
+        fname = re.sub('^Instruct-', '', fname)
+        with open(Path(f'characters/instruction-following/{fname}.yaml'), 'r', encoding='utf-8') as f:
+            data = yaml.safe_load(f)
+            output = ''
+            if 'context' in data:
+                output += data['context']
+            replacements = {
+                '<|user|>': data['user'],
+                '<|bot|>': data['bot'],
+                '<|user-message|>': 'Input',
+            }
+            output += utils.replace_all(data['turn_template'].split('<|bot-message|>')[0], replacements)
+            return output
    else:
        with open(Path(f'prompts/{fname}.txt'), 'r', encoding='utf-8') as f:
            text = f.read()
@ -472,7 +489,7 @@ def create_interface():
    gen_events = []
    default_preset = shared.settings['presets'][next((k for k in shared.settings['presets'] if re.match(k.lower(), shared.model_name.lower())), 'default')]
    if len(shared.lora_names) == 1:
-        default_text = load_prompt(shared.settings['lora_prompts'][next((k for k in shared.settings['lora_prompts'] if re.match(k.lower(), shared.lora_names[0].lower())), 'default')])
+        default_text = load_prompt(shared.settings['prompts'][next((k for k in shared.settings['prompts'] if re.match(k.lower(), shared.lora_names[0].lower())), 'default')])
    else:
        default_text = load_prompt(shared.settings['prompts'][next((k for k in shared.settings['prompts'] if re.match(k.lower(), shared.model_name.lower())), 'default')])
    title = 'Text generation web UI'
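The new 'Instruct-' branch of `load_prompt` builds a default prompt from an instruction template instead of a prompts/*.txt file. A rough sketch of that rendering step, assuming the YAML has already been parsed into a dict; `render_instruct_prompt` is a hypothetical name, and the sample dict mirrors the Metharme.yaml fields shown in this diff:

```python
def replace_all(text, dic):
    # Same helper as modules/utils.py in this commit.
    for i, j in dic.items():
        text = text.replace(i, j)
    return text


def render_instruct_prompt(data):
    """Hypothetical stand-in for the 'Instruct-' branch of load_prompt.

    `data` is the parsed template: optional 'context', plus 'user', 'bot'
    and 'turn_template'.
    """
    output = ''
    if 'context' in data:
        output += data['context']
    replacements = {
        '<|user|>': data['user'],
        '<|bot|>': data['bot'],
        '<|user-message|>': 'Input',
    }
    # Keep only the part of the turn before the bot's reply placeholder.
    prefix = data['turn_template'].split('<|bot-message|>')[0]
    output += replace_all(prefix, replacements)
    return output


metharme = {
    'user': '<|user|>',
    'bot': '<|model|>',
    'context': '<|system|>',
    'turn_template': '<|user|><|user-message|><|bot|><|bot-message|>',
}
prompt = render_instruct_prompt(metharme)
print(prompt)
```

Cutting the template at '<|bot-message|>' leaves the prompt ending right where the model is expected to start writing, which is what makes the per-model .txt prompt files redundant.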
--- a/settings-template.json
+++ b/settings-template.json
@ -40,31 +40,6 @@
    },
    "prompts": {
        "default": "QA",
-        ".*(gpt4chan|gpt-4chan|4chan)": "GPT-4chan",
+        ".*(gpt4chan|gpt-4chan|4chan)": "GPT-4chan"
-        ".*(oasst|stablelm-7b-sft-v7-epoch-3)": "Open Assistant",
-        ".*(alpac|dolly)": "Alpaca",
-        ".*mpt-.*instruct": "Alpaca",
-        "(?!.*v0)(?!.*1.1)(?!.*1_1)(?!.*stable).*vicuna": "Vicuna v0",
-        ".*vicuna.*v0": "Vicuna v0",
-        ".*vicuna.*(1.1|1_1)": "Vicuna v1.1",
-        ".*stable.*vicuna": "StableVicuna",
-        ".*metharme": "Metharme",
-        ".*guanaco": "Guanaco-Chat",
-        ".*koala": "Koala",
-        ".*stablelm-tuned": "StableLM",
-        ".*wizardlm": "WizardLM",
-        ".*galactica.*finetuned": "Galactica Finetuned",
-        ".*galactica.*-v2": "Galactica v2",
-        "(?!.*finetuned)(?!.*-v2).*galactica": "Galactica",
-        ".*baize": "Baize",
-        ".*mpt-.*chat": "MPT-Chat",
-        "(?!.*-flan-)(?!.*-t5-).*lamini-": "Alpaca",
-        ".*incite.*chat": "INCITE-Chat",
-        ".*incite.*instruct": "INCITE-Instruct"
-    },
-    "lora_prompts": {
-        "default": "QA",
-        ".*alpaca": "Alpaca",
-        ".*baize": "Baize"
    }
 }