numGpu method

OllamaBuilder numGpu(
  int gpuLayers
)

Sets the number of model layers to load onto the GPU.

Controls how many layers of the model are loaded onto the GPU. Loading more layers onto the GPU gives faster inference at the cost of higher GPU memory usage.

  • 0: CPU only (slowest, lowest GPU memory usage)
  • -1: Load all layers on GPU (fastest, highest memory)
  • Positive number: Load specified number of layers on GPU
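The three cases above can be restated as a small helper. This is only a sketch: describeGpuLayers is a hypothetical function, not part of the library, and rejecting values below -1 is an assumption, since the documentation does not say how such values are handled.

```dart
// Hypothetical helper, not part of the library: it only restates the
// three documented cases for a given gpuLayers value.
String describeGpuLayers(int gpuLayers) {
  if (gpuLayers == 0) return 'CPU only';
  if (gpuLayers == -1) return 'all layers on GPU';
  if (gpuLayers > 0) return '$gpuLayers layers on GPU';
  // Assumption: values below -1 are not covered by the documentation,
  // so this sketch rejects them.
  throw ArgumentError.value(
      gpuLayers, 'gpuLayers', 'expected -1, 0, or a positive number');
}

void main() {
  for (final n in [0, -1, 20]) {
    print('numGpu($n): ${describeGpuLayers(n)}');
  }
}
```

A partial offload, such as the value 20 used here, may be a reasonable middle ground when the whole model does not fit in GPU memory.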

Implementation

OllamaBuilder numGpu(int gpuLayers) {
  _baseBuilder.extension('numGpu', gpuLayers);
  return this;
}
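Because the method returns this, calls can be chained. The following self-contained sketch illustrates that pattern under stated assumptions: FakeBaseBuilder and SketchOllamaBuilder are hypothetical stand-ins written for this example, and only the extension('numGpu', gpuLayers) call shape is taken from the implementation above. How the real base builder stores or forwards the value is not shown in the source.

```dart
// Hypothetical stand-in for the real base builder. Storing extensions
// in a map is an assumption made for illustration.
class FakeBaseBuilder {
  final Map<String, dynamic> extensions = {};

  void extension(String key, dynamic value) {
    extensions[key] = value;
  }
}

// Minimal sketch that mirrors the documented method on a stand-in class.
class SketchOllamaBuilder {
  final FakeBaseBuilder _baseBuilder;

  SketchOllamaBuilder(this._baseBuilder);

  SketchOllamaBuilder numGpu(int gpuLayers) {
    _baseBuilder.extension('numGpu', gpuLayers);
    return this; // returning this is what allows chaining
  }
}

void main() {
  final base = FakeBaseBuilder();
  final builder = SketchOllamaBuilder(base);

  // In this stand-in, a later call overwrites the earlier value
  // because both are stored under the same 'numGpu' key.
  final returned = builder.numGpu(0).numGpu(-1);

  print(identical(returned, builder)); // true
  print(base.extensions['numGpu']); // -1
}
```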