"Defines how the chat prompt is generated. In instruct and chat-instruct modes, the instruction template Parameters > Instruction template is used.":"定义如何生成聊天提示词。在 指令 和 聊天指令 模式下,默认使用 参数 > 指令模板 下选择的指令模板。",
"chat":"聊天",
"chat-instruct":"聊天指令",
"instruct":"指令",
"Chat style":"聊天界面主题",
"Command for chat-instruct mode":"聊天指令模式下的指令",
"<|character|> and <|prompt|> get replaced with the bot name and the regular chat prompt respectively.":"“<|character|>”和“<|prompt|>”分别会被替换成机器人名称和常规聊天提示词。",
"Set to greater than 0 to enable DRY. Recommended value: 0.8.":"将值设为大于零以启用DRY。推荐值:0.8。",
"dry_allowed_length":"DRY允许重复的序列长度",
"Longest sequence that can be repeated without being penalized.":"可免于被惩罚的最长重复序列。",
"dry_base":"DRY基数",
"Controls how fast the penalty grows with increasing sequence length.":"控制随着重复的序列的长度增长,惩罚的增长有多快。",
"dry_sequence_breakers":"DRY序列匹配中断符",
"Tokens across which sequence matching is not continued. Specified as a comma-separated list of quoted strings.":"这些词符会打断并分隔序列的匹配。该参数以逗号分隔的引号字符串列表形式指定。",
"Learn more":"了解更多",
"Expand max_new_tokens to the available context length.":"将最大新词符数扩展到可用的上下文长度。",
"auto_max_new_tokens":"自动确定最大新词符数",
"Forces the model to never end the generation prematurely.":"强制模型永不过早结束生成。",
"Ban the eos_token":"禁用序列终止符",
"Disabling this can make the replies more creative.":"禁用此项可以使回复更加具有创造性。",
"Add the bos_token to the beginning of prompts":"在提示词开头添加序列起始符",
"Custom stopping strings":"自定义停止字符串",
"Written between \"\" and separated by commas.":"用英文半角逗号分隔,用\"\"包裹。",
"Token bans":"禁用词符",
"Token IDs to ban, separated by commas. The IDs can be found in the Default or Notebook tab.":"填入要禁用的词符ID,用英文半角逗号分隔。你可以在默认或笔记本标签页获得词符的ID。",
"penalty_alpha":"惩罚系数α",
"For Contrastive Search. do_sample must be unchecked.":"用于对比搜索,必须取消勾选“使用采样算法”",
"guidance_scale":"指导比例",
"For CFG. 1.5 is a good value.":"用于CFG,1.5是个不错的值。",
"Negative prompt":"负面提示词",
"mirostat_mode":"mirostat模式",
"mode=1 is for llama.cpp only.":"模式1仅适用于llama.cpp。",
"mirostat_tau":"mirostat参数τ",
"mirostat_eta":"mirostat参数η",
"epsilon_cutoff":"ε截断",
"eta_cutoff":"η截断",
"encoder_repetition_penalty":"编码器重复惩罚",
"no_repeat_ngram_size":"禁止重复的N元语法元数",
"Load grammar from file (.gbnf)":"从.gbnf文件加载语法",
"Grammar":"语法",
"tfs":"无尾采样超参数",
"top_a":"Top A",
"smoothing_factor":"平滑因子",
"Activates Quadratic Sampling.":"激活二次采样。",
"smoothing_curve":"平滑曲线",
"Adjusts the dropoff curve of Quadratic Sampling.":"调整二次采样的衰减曲线。",
"dynamic_temperature":"动态温度",
"dynatemp_low":"动态温度最小值",
"dynatemp_high":"动态温度最大值",
"dynatemp_exponent":"动态温度指数",
"Moves temperature/dynamic temperature/quadratic sampling to the end of the sampler stack, ignoring their positions in \"Sampler priority\".":"将温度/动态温度/二次采样移至采样器堆栈的末端,忽略它们在“采样器优先级”中的位置。",
"temperature_last":"温度采样放最后",
"Sampler priority":"采样器优先级",
"Parameter names separated by new lines or commas.":"参数名用新行或逗号分隔。",
"Truncate the prompt up to this length":"将提示词截断至此长度",
"The leftmost tokens are removed if the prompt exceeds this length. Most models require this to be at most 2048.":"如果提示词超出这个长度,最左边的词符将被移除。大多数模型要求这个长度最多为2048。",
"prompt_lookup_num_tokens":"提示词查找解码词符数",
"Activates Prompt Lookup Decoding.":"启用提示词查找解码。",
"Maximum tokens/second":"每秒最多词符数",
"To make text readable in real time.":"用它使文本实时可读。",
"Maximum UI updates/second":"每秒最大UI刷新次数",
"Set this if you experience lag in the UI during streaming.":"如果你在流式输出时感到UI卡顿,可以调整此设置。",
"Seed (-1 for random)":"种子(-1表示随机)",
"Some specific models need this unset.":"有些特定的模型需要取消这个设置。",
"Skip special tokens":"跳过特殊词符",
"Activate text streaming":"激活文本流式输出",
"Character":"角色",
"User":"用户",
"Chat history":"聊天记录",
"Upload character":"上传角色",
"Used in chat and chat-instruct modes.":"用在聊天和聊天指令模式下。",
"Character's name":"角色的名字",
"Context":"背景",
"Greeting":"开场白",
"Name":"名字",
"Description":"描述",
"Here you can optionally write a description of yourself.":"您可以在这里写下有关您自己的描述。",
"gpu-memory in MiB for device :0":"GPU内存(MiB)设备:0",
"cpu-memory in MiB":"CPU内存(MiB)",
"load-in-4bit params:":"以4位量化加载参数:",
"compute_dtype":"计算数据类型",
"quant_type":"量化类型",
"hqq_backend":"HQQ后端",
"n-gpu-layers":"GPU层数",
"Must be set to more than 0 for your GPU to be used.":"必须要设为大于0的值,你的GPU才会被使用。",
"n_ctx":"上下文大小",
"Context length. Try lowering this if you run out of memory while loading the model.":"上下文长度。如果在加载模型时内存不足,请尝试降低此值。",
"tensor_split":"张量分割",
"List of proportions to split the model across multiple GPUs. Example: 60,40":"将模型分割到多个GPU的比例列表。示例:60,40",
"n_batch":"批处理大小",
"threads":"线程数",
"threads_batch":"批处理线程数",
"wbits":"权重位数",
"groupsize":"组大小",
"gpu-split":"GPU分割",
"Comma-separated list of VRAM (in GB) to use per GPU. Example: 20,7,7":"以逗号分隔的每个GPU使用的VRAM(以GB为单位)列表。示例:20,7,7",
"max_seq_len":"最大序列长度",
"alpha_value":"alpha值",
"Positional embeddings alpha factor for NTK RoPE scaling. Recommended values (NTKv1): 1.75 for 1.5x context, 2.5 for 2x context. Use either this or compress_pos_emb, not both.":"NTK RoPE缩放的位置嵌入alpha因子。推荐值(NTKv1):1.5倍上下文长度用1.75,2倍上下文长度用2.5。使用此项或压缩位置嵌入,不要同时使用。",
"rope_freq_base":"rope频率基数",
"Positional embeddings frequency base for NTK RoPE scaling. Related to alpha_value by rope_freq_base = 10000 * alpha_value ^ (64 / 63). 0 = from model.":"用于NTK RoPE缩放的位置嵌入频率基数。它和alpha值的关系是 rope频率基数 = 10000 * alpha值 ^ (64 / 63)。此值设为0表示使用模型自带的该参数。",
"compress_pos_emb":"压缩位置嵌入",
"Positional embeddings compression factor. Should be set to (context length) / (model's original context length). Equal to 1/rope_freq_scale.":"位置嵌入的压缩因子。应设置为(上下文长度)/(模型原始上下文长度)。等于1/rope频率基数。",
"ExLlamav2_HF is recommended over AutoGPTQ for models derived from Llama.":"推荐使用ExLlamav2_HF而非AutoGPTQ,适用于从Llama衍生的模型。",
"load-in-8bit":"以8位量化加载",
"load-in-4bit":"以4位量化加载",
"use_double_quant":"使用双重量化",
"Set use_flash_attention_2=True while loading the model.":"加载模型时设置use_flash_attention_2=True。",
"use_flash_attention_2":"使用flash_attention 2",
"Set attn_implementation= eager while loading the model.":"在加载模型时设置attn_implementation的值为eager。",
"use_eager_attention":"使用eager_attention",
"Use flash-attention.":"使用flash-attention。",
"flash_attn":"使用flash_attn",
"auto-devices":"自动分配设备",
"NVIDIA only: use llama-cpp-python compiled with tensor cores support. This may increase performance on newer cards.":"仅限N卡:使用编译了tensorcores支持的llama-cpp-python。这在新款的RTX显卡上可能可以提高性能。",
"tensorcores":"张量核心",
"Use 8-bit cache to save VRAM.":"使用8位缓存来节省显存。",
"cache_8bit":"8位缓存",
"Use Q4 cache to save VRAM.":"使用4位量化缓存来节省显存。",
"cache_4bit":"4位缓存",
"(experimental) Activate StreamingLLM to avoid re-evaluating the entire prompt when old messages are removed.":"(实验性功能)激活StreamingLLM以避免在删除旧消息时重新评估整个提示词。",
"StreamingLLM: number of sink tokens. Only used if the trimmed prompt doesn't share a prefix with the old prompt.":"StreamingLLM:下沉词符的数量。仅在修剪后的提示词不与旧提示词前缀相同时使用。",
"llama.cpp: Use llama-cpp-python compiled without GPU acceleration. Transformers: use PyTorch in CPU mode.":"llama.cpp:使用没有GPU加速的llama-cpp-python编译。Transformers:使用PyTorch的CPU模式。",
"cpu":"CPU",
"Split the model by rows across GPUs. This may improve multi-gpu performance.":"在GPU之间按行分割模型。这可能会提高多GPU性能。",
"row_split":"行分割",
"Do not offload the K, Q, V to the GPU. This saves VRAM but reduces the performance.":"不将K、Q、V向量转移到GPU。这可以节省VRAM,但会降低性能。",
"no_offload_kqv":"不转移KQV",
"Disable the mulmat kernels.":"禁用mulmat内核。",
"no_mul_mat_q":"禁用mul_mat_q",
"triton":"Triton",
"Affects Triton only. Disable fused MLP. Fused MLP improves performance but uses more VRAM. Disable if running low on VRAM.":"仅影响Triton。禁用融合MLP。融合MLP可以提高性能,但会使用更多的VRAM。如果VRAM不足,请禁用。",
"no_inject_fused_mlp":"不注入融合MLP",
"This can make models faster on some systems.":"在某些系统上,这可以使模型更快。",
"no_use_cuda_fp16":"不使用cuda_fp16",
"'desc_act', 'wbits', and 'groupsize' are used for old models without a quantize_config.json.":"'按递减激活顺序量化'、'权重位'和'组大小'用于没有quantize_config.json的旧模型。",
"desc_act":"按递减激活顺序量化",
"no-mmap":"不使用内存映射",
"mlock":"内存锁定",
"NUMA support can help on some systems with non-uniform memory access.":"NUMA支持可以在具有非统一内存访问的系统上提供帮助。",
"Automatically split the model tensors across the available GPUs.":"自动在可用的GPU之间分割模型张量。",
"autosplit":"自动分割",
"no_flash_attn":"不使用flash_attn",
"no_xformers":"不使用xformers",
"no_sdpa":"不使用sdpa",
"Necessary to use CFG with this loader.":"配合CFG使用此加载器时,必须勾选此项。",
"cfg-cache":"CFG缓存",
"Enable inference with ModelRunnerCpp, which is faster than the default ModelRunner.":"启用ModelRunnerCpp进行推理,它比默认的ModelRunner更快。",
"cpp-runner":"Cpp运行器",
"Number of experts per token":"每个词符的专家数量",
"Only applies to MoE models like Mixtral.":"仅适用于像Mixtral这样的混合专家模型。",
"Set trust_remote_code=True while loading the tokenizer/model. To enable this option, start the web UI with the --trust-remote-code flag.":"加载词符化器/模型时设置trust_remote_code=True。要启用此选项,请使用--trust-remote-code参数启动Web UI。",
"trust-remote-code":"信任远程代码(trust-remote-code)",
"Set use_fast=False while loading the tokenizer.":"加载词符化器时设置use_fast=False。",
"no_use_fast":"不使用快速词符化器",
"Needs to be set for perplexity evaluation to work with this loader. Otherwise, ignore it, as it makes prompt processing slower.":"使用此加载器进行困惑度评估时需要设置。否则,请忽略它,因为它会使提示词处理速度变慢。",
"logits_all":"全部计算Logit",
"Disable ExLlama kernel for GPTQ models.":"对于GPTQ模型,禁用ExLlama内核。",
"disable_exllama":"禁用ExLlama",
"Disable ExLlamav2 kernel for GPTQ models.":"对于GPTQ模型,禁用ExLlamav2内核。",
"disable_exllamav2":"禁用ExLlamav2",
"ExLlamav2_HF is recommended over ExLlamav2 for better integration with extensions and more consistent sampling behavior across loaders.":"相比于ExLlamav2,推荐使用ExLlamav2_HF,因为它与扩展有更好的集成,并且在加载器之间提供了更一致的采样行为。",
"llamacpp_HF loads llama.cpp as a Transformers model. To use it, you need to place your GGUF in a subfolder of models/ with the necessary tokenizer files.":"llamacpp_HF将llama.cpp作为Transformers模型加载。要使用它,您需要将GGUF放在models/的子文件夹中,并提供必要的词符化器文件。",
"You can use the \"llamacpp_HF creator\" menu to do that automatically.":"您可以使用'llamacpp_HF创建器'菜单自动完成。",
"TensorRT-LLM has to be installed manually in a separate Python 3.10 environment at the moment. For a guide, consult the description of":"目前需要在一个单独的 Python 3.10 环境中手动安装 TensorRT-LLM。有关指南,请参阅",
"this PR":"这个 PR",
"is only used when":"仅在选中",
"is checked.":"时使用。",
"cpp_runner":"Cpp运行器",
"does not support streaming at the moment.":"目前不支持流式传输。",
"Whether to load the model as soon as it is selected in the Model dropdown.":"选择模型下拉菜单中的模型后是否立即加载模型。",
"Autoload the model":"自动加载模型",
"Download":"下载",
"llamacpp_HF creator":"llamacpp_HF创建器",
"Customize instruction template":"自定义指令模板",
"Download model or LoRA":"下载模型或LoRA",
"Enter the Hugging Face username/model path, for instance: facebook/galactica-125m. To specify a branch, add it at the end after a \":\" character like this: facebook/galactica-125m:main. To download a single file, enter its name in the second box.":"输入Hugging Face用户名/模型路径,例如:facebook/galactica-125m。要指定分支,在最后加上\":\"字符,例如:facebook/galactica-125m:main。要下载单个文件,请在第二个框中输入其名称。",
"Get file list":"获取文件列表",
"Choose your GGUF":"选择你的GGUF模型",
"Enter the URL for the original (unquantized) model":"输入原始(未量化)模型的URL",
"along with the necessary tokenizer files.":"的子文件夹中,并附带必要的词符化器文件。",
"Select the desired instruction template":"选择所需的指令模板",
"This allows you to set a customized template for the model currently selected in the \"Model loader\" menu. Whenever the model gets loaded, this template will be used in place of the template specified in the model's medatada, which sometimes is wrong.":"这允许你为\"模型加载器\"菜单中当前选中的模型设置一个自定义模板。每当加载模型时,都会使用此模板代替模型元数据中指定的模板,有时后者可能是错误的。",
"No model is loaded":"没有加载模型",
"Train LoRA":"训练LoRA",
"Perplexity evaluation":"困惑度评估",
"Tutorial":"教程",
"Copy parameters from":"从以下LoRA复制参数",
"The name of your new LoRA file":"新LoRA文件的名称",
"If the name is the same, checking will replace the existing file, and unchecking will load and continue from it (the rank must be the same).":"如果名称相同,选中将替换现有文件,未选中将加载并继续(秩必须相同)。",
"Selects which modules to target in training. Targeting more modules is closer to a full fine-tune at the cost of increased VRAM requirements and adapter size.\nNOTE: Only works for model_id='llama', other types will retain default training behavior and not use these settings.":"选择在训练中要针对的模块。针对更多模块更接近完整的微调,但会增加VRAM需求和适配器大小。\n注意:仅对model_id='llama'有效,其他类型将保留默认训练行为,不使用这些设置。",
"Enable q_proj":"启用q_proj",
"Enable v_proj":"启用v_proj",
"Enable k_proj":"启用k_proj",
"Enable o_proj":"启用o_proj",
"Enable gate_proj":"启用gate_proj",
"Enable down_proj":"启用down_proj",
"Enable up_proj":"启用up_proj",
"LoRA Rank":"LoRA秩",
"Also called dimension count. Higher values = larger file, more content control. Smaller values = smaller file, less control. Use 4 or 8 for style, 128 or 256 to teach, 1024+ for fine-detail on big data. More VRAM is needed for higher ranks.":"也称为维度计数。较高的值=更大的文件,更多的内容控制。较小的值=更小的文件,控制力较差。用4或8来表示风格,用128或256来教学,用1024+来细节处理大数据。更高的秩需要更多的VRAM。",
"This divided by the rank becomes the scaling of the LoRA. Higher means stronger. A good standard value is twice your Rank.":"这个除以秩为LoRA的缩放。较高意味着更强。一个好的标准值是秩的两倍。",
"Batch Size":"批量大小",
"Global batch size. The two batch sizes together determine gradient accumulation (gradientAccum = batch / microBatch). Higher gradient accum values lead to better quality training.":"全局批量大小。这两个批量大小共同决定了梯度累积(梯度累积 = 批量大小 / 微批量大小)。较高的梯度累积值会带来更好的训练质量。",
"Micro Batch Size":"微批量大小",
"Per-device batch size (NOTE: multiple devices not yet implemented). Increasing this will increase VRAM usage.":"每个设备的批量大小(注意:多设备尚未实现)。增加这个将增加VRAM使用。",
"Cutoff Length":"截断长度",
"Cutoff length for text input. Essentially, how long of a line of text to feed in at a time. Higher values require drastically more VRAM.":"文本输入的截断长度。本质上说,就是一次输入多长的文本行。较高的值需要大量的VRAM。",
"Save every n steps":"每n步保存一次",
"If above 0, a checkpoint of the LoRA will be saved every time this many steps pass.":"如果大于0,每当这么多步过去时,就会保存LoRA的一个检查点。",
"Epochs":"周期",
"Number of times every entry in the dataset should be fed into training. So 1 means feed each item in once, 5 means feed it in five times, etc.":"数据集中的每个条目应该输入训练的次数。所以1意味着每个项目输入一次,5意味着输入五次,等等。",
"Learning Rate":"学习率",
"In scientific notation. 3e-4 is a good starting base point. 1e-2 is extremely high, 1e-6 is extremely low.":"用科学记数法表示。3e-4是一个很好的起点。1e-2非常高,1e-6非常低。",
"LR Scheduler":"学习率调度器",
"Learning rate scheduler - defines how the learning rate changes over time. \"Constant\" means never change, \"linear\" means to go in a straight line from the learning rate down to 0, cosine follows a curve, etc.":"学习率调度器 - 定义学习率随时间的变化方式。\"Constant\"意味着永不改变,\"linear\"意味着从学习率直线下降到0,cosine遵循曲线等等。",
"Percentage probability for dropout of LoRA layers. This can help reduce overfitting. Most users should leave at default.":"LoRA层的dropout概率百分比。这可以帮助减少过拟合。大多数用户应保持默认值。",
"Stop at loss":"停止损失",
"The process will automatically stop once the desired loss value is reached. (reasonable numbers are 1.5-1.8)":"一旦达到期望的损失值,过程将自动停止。(合理的数字是1.5-1.8)",
"Optimizer":"优化器",
"Different optimizer implementation options, for advanced users. Effects of different options are not well documented yet.":"不同优化器实现选项,供高级用户使用。不同选项的效果尚未得到很好的记录。",
"Warmup Steps":"热身步数",
"For this many steps at the start, the learning rate will be lower than normal. This helps the trainer prepare the model and precompute statistics to improve the quality of training after the start.":"在开始时的这么多步骤中,学习率将低于正常水平。这有助于训练器准备模型并预先计算统计数据,以提高开始后的训练质量。",
"Train Only After":"仅在此之后训练",
"Only consider text *after* this string in any given chunk for training. For Alpaca datasets, use \"### Response:\" to only train the response and ignore the input.":"在任何给定的文本块中,只考虑*在此字符串之后*的文本进行训练。对于Alpaca数据集,使用\"### Response:\"仅训练响应并忽略输入。",
"Adds EOS token for each dataset item. In case of raw text, the EOS will be added at the Hard Cut":"为每个数据集项目添加序列终止符。如果是原始文本,则序列终止符将添加在硬切割处",
"Add EOS token":"添加序列终止符",
"If checked, changes Rank/Alpha slider above to go much higher. This will not work without a datacenter-class GPU.":"如果选中,将更改上面的秩/Alpha滑块,使其更高。如果没有数据中心级GPU,这将不起作用。",
"The format file used to decide how to format the dataset input.":"用于决定如何格式化数据集输入的格式文件。",
"Dataset":"数据集",
"The dataset file to use for training.":"用于训练的数据集文件。",
"Evaluation Dataset":"评估数据集",
"The (optional) dataset file used to evaluate the model after training.":"用于在训练后评估模型的(可选)数据集文件。",
"Evaluate every n steps":"每n步评估一次",
"If an evaluation dataset is given, test it every time this many steps pass.":"如果给出评估数据集,每次训练这么多步后测试它。",
"Text file":"文本文件",
"The raw text file to use for training.":"用于训练的原始文本文件。",
"Overlap Length":"重叠长度",
"How many tokens from the prior chunk of text to include into the next chunk. (The chunks themselves will be of a size determined by Cutoff Length). Setting overlap to exactly half the cutoff length may be ideal.":"在下一个文本块中包含多少个来自前一个文本块的词符。(文本块本身的大小由截断长度决定)。将重叠长度设置为截断长度的恰好一半可能比较理想。",
"Prefer Newline Cut Length":"优先换行剪切长度",
"Length (in characters, not tokens) of the maximum distance to shift an overlap cut by to ensure chunks cut at newlines. If too low, cuts may occur in the middle of lines.":"为了确保文本块在换行处剪切,可移动重叠剪切的最大距离的长度(以字符而非词符数计算)。如果设置得太低,剪切可能会发生在行中间。",
"Hard Cut String":"硬剪切字符串",
"String that indicates a hard cut between text parts. Helps prevent unwanted overlap.":"表示文本部分之间硬剪切的字符串。有助于防止不想要的重叠。",
"Ignore small blocks":"忽略小块",
"Ignore Hard Cut blocks that have less or equal characters than this number":"忽略小于或等于该数字字符的硬剪切块。",
"Start LoRA Training":"开始LoRA训练",
"Interrupt":"中断",
"Ready":"准备就绪",
"Models":"模型",
"Input dataset":"输入数据集",
"The raw text file on which the model will be evaluated. The first options are automatically downloaded: wikitext, ptb, and ptb_new. The next options are your local text files under training/datasets.":"用来进行模型评估的原始文本文件。前几个选项会自动下载:wikitext, ptb, 和 ptb_new。接下来的选项是您在training/datasets下的本地文本文件。",
"Stride":"步长",
"Used to make the evaluation faster at the cost of accuracy. 1 = slowest but most accurate. 512 is a common value.":"以牺牲准确性为代价来加快评估速度。1 = 最慢但最准确。512是一个常见的值。",
"max_length":"最大长度",
"The context for each evaluation. If set to 0, the maximum context length for the model will be used.":"每次评估的上下文长度。如果设置为0,将使用模型的最大上下文长度。",
"Apply flags/extensions and restart":"应用命令行参数/扩展并重启",
"Save UI defaults to settings.yaml":"将UI默认设置保存到settings.yaml",
"Available extensions":"可用扩展",
"Note that some of these extensions may require manually installing Python requirements through the command: pip install -r extensions/extension_name/requirements.txt":"注意,一些扩展可能需要通过命令手动安装Python依赖:pip install -r extensions/extension_name/requirements.txt",