Changes in version 0.5.0

- Upgrade to whisper.cpp version v1.8.2
- Move build system to cmake
- Add option to use flash-attention
- Add option to use integrated Voice Activity Detection using Silero VAD model v5.1.2
- Updated whisper_print_benchmark

Changes in version 0.4.2

- Allow passing no_timestamps to predict.whisper

Changes in version 0.4.1

- Added function predict.whisper_transcription, which assigns a transcription segment to either the left or the right channel based on a Voice Activity Detection

Changes in version 0.4

- Allow passing multiple offsets/durations
- Allow giving sections in the audio (e.g. detected with a voice activity detector) to keep only these (voiced) data, transcribe them, and add the amount of time which was cut out to the from/to timestamps such that the resulting timepoints in from/to are aligned to the original audio file
- The data element of the predict.whisper output now includes a column called segment_offset indicating the offset of the provided sections or offsets

Changes in version 0.3.3

- Fixes of typos in documentation of functions
- Add stereo.wav file
- Allow diarization for audio with 2 channels by comparing the energy of the signal in each channel for each segment

Changes in version 0.3.2

- Documentation of arguments in predict.whisper
- Add option to download quantised models
  - tiny-q5_1, tiny.en-q5_1
  - base-q5_1, base.en-q5_1
  - small-q5_1, small.en-q5_1
  - medium-q5_0, medium.en-q5_0
  - large-v2-q5_0 and large-v3-q5_0
- Allow disabling the printing of the transcription evolution during the prediction with the trace argument
- Enable O3 optimisations by default
- Allow speedup of transcriptions by compiling with cuBLAS against CUDA on Linux
  - specify Sys.setenv(WHISPER_CUBLAS = "1") before installing the package if you have a GPU with CUDA

Changes in version 0.3.1

- Makevars
  - Added detection of AVX512F for adding compilation flags to PKG_CFLAGS/PKG_CPPFLAGS
  - Enable Metal for speeding up transcriptions on the GPU on Mac
  - Enable compiling with OpenBLAS to speed up the transcriptions
- Add whisper_languages to get a data.frame of all languages the Whisper model can handle
- whisper_download_model
  - Change default timeout to 10 minutes if no timeout is set by the user, and change the output element in the list to 'download_success' instead of 'download_failed'
  - model_dir now defaults to the directory set in the environment variable WHISPER_MODEL_DIR and, if this is not set, to the current working directory
- whisper
  - Add option use_gpu to be able to run the prediction on a GPU (e.g. Metal)
- predict.whisper
  - Add option to pass on an initial prompt
  - Output of predict.whisper adds the audio duration of the wav file in seconds in the params list element
  - Gains an extra argument indicating whether to transcribe or translate

Changes in version 0.3

- Upgrade to whisper.cpp version v1.5.4
- whisper_download_model allows downloading 'large-v1', 'large-v2', 'large-v3', while model 'large' should no longer be used

Changes in version 0.2.2

- Add option to pass on float entropy_thold (similar to compression_ratio_threshold), logprob_thold, beam_size, best_of, split_on_word, max_context when doing the prediction
- Output of the predict.whisper function now includes an element called timing indicating how long it took to do the transcription
- whisper gains 2 arguments, model_dir and overwrite, which are passed directly to whisper_download_model
- whisper_download_model
  - gains an argument version which defaults to models for whisper.cpp version 1.2.1
  - gets the models now from https://huggingface.co/ggerganov/whisper.cpp/resolve/80da2d8bfee42b0e836fc3a9890373e5defc00a6 instead of https://huggingface.co/ggerganov/whisper.cpp/resolve/main

Changes in version 0.2.1-1

- whisper_download_model now deprecates downloading from https://ggml.ggerganov.com and the URLs were changed to download models from huggingface (Issue #18)

Changes in version 0.2.1

- Add option to compile with own PKG_CFLAGS by setting environment variable WHISPER_CFLAGS
- Add option to compile with extra PKG_CPPFLAGS by setting environment variable WHISPER_CPPFLAGS

Changes in version 0.2.0

- Incorporate whisper.cpp version v1.2.1

Changes in version 0.1.3

- Ongoing work on improving compilation instructions to speed up transcribing while still being CRAN compliant
- Add whisper_benchmark

Changes in version 0.1.2

- Incorporate whisper.cpp release 1.0.4: https://github.com/ggerganov/whisper.cpp/releases/tag/1.0.4 and up to commit 99da1e5cc853f7cdd61d2f259c8d770ea9279d29
- predict.whisper now uses 'auto' as default language
- predict.whisper now sets resulting text with UTF-8 encoding

Changes in version 0.1.1

- Incorporate https://github.com/ggerganov/whisper.cpp/pull/257 (Remove C++20 requirement)

Changes in version 0.1.0

- Initial version based on
  - https://github.com/ggerganov/whisper.cpp commit 85c9ac18b59125b988cda40f40d8687e1ba88a7a
  - https://github.com/mackron/dr_libs commit dd762b861ecadf5ddd5fb03e9ca1db6707b54fbb
- Added whisper - whisper_download_model, whisper and predict.whisper