numBatch method

OllamaBuilder numBatch(
  int batchSize
)

Sets the batch size for token processing.

Controls the number of tokens processed in parallel during inference. Larger batch sizes can improve throughput but use more memory.

  • Default: 512
  • Range: 1-2048 (depends on available memory)
  • Higher values: Better throughput, more memory usage
  • Lower values: Less memory usage, potentially lower throughput
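For example, a caller on memory-constrained hardware might lower the batch size from the default. The builder chain around `numBatch` below (the `OllamaBuilder` constructor and `build` call) is a hypothetical sketch for illustration; only `numBatch` itself is documented here.

```
// Hypothetical usage sketch: trade throughput for lower memory use
// by reducing the batch size from the default of 512.
final provider = OllamaBuilder()
    .numBatch(256) // fewer tokens processed in parallel
    .build();
```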

Implementation

OllamaBuilder numBatch(int batchSize) {
  // Forward the value to the underlying builder as a
  // provider-specific extension, then return this builder
  // so calls can be chained.
  _baseBuilder.extension('numBatch', batchSize);
  return this;
}