# Configuration
## Configuration files
There are a few server configuration files that you can use to customize the server. These files are located in the `config` directory of the server.
### config.py

This holds the Flask app configuration: which models to use, the environment type, and where uploaded audio is stored. Here is an example of the `config.py` file:
```python
TRANSCRIPTION_MODEL = "large-v3"
CATEGORIZATION_MODEL = "gemma"  # You can also use "gpt", but that involves setting up your API key
SUMMARIZATION_MODEL = "huggingface"  # You can also use "gpt"
ENV_TYPE = "dev"
UPLOAD_FOLDER = "raw_audio/"
```
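If you're curious how these values might be consumed, here is a minimal sketch; loading them with `app.config.from_object` and the `config.config` import path are assumptions for illustration, not necessarily how classifAI-engine wires it up.

```python
from flask import Flask

app = Flask(__name__)

# Hypothetical wiring: pull the settings above into the Flask config.
# Assumes src/config/config.py is importable as "config.config"; the real
# classifAI-engine may load its settings differently.
app.config.from_object("config.config")

print(app.config["TRANSCRIPTION_MODEL"])  # "large-v3" with the defaults above
print(app.config["UPLOAD_FOLDER"])        # "raw_audio/"
```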
More information about how to set up your API key for the GPT models can be found in the OpenAI documentation. Note that using the GPT models costs money.
Note that `UPLOAD_FOLDER` is the directory where the raw audio files are stored. It is relative to the project root, so the default is `classifAI-engine/raw_audio/`.
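For example, an upload handler might save incoming files there. The sketch below is illustrative only: the route, form field name, and return value are made up, not the actual classifAI-engine endpoint.

```python
import os

from flask import Flask, request
from werkzeug.utils import secure_filename

UPLOAD_FOLDER = "raw_audio/"  # same default as config.py

app = Flask(__name__)

@app.route("/upload", methods=["POST"])  # hypothetical route and field name
def upload_audio():
    # Store the raw audio file under raw_audio/, relative to the project root.
    audio = request.files["file"]
    os.makedirs(UPLOAD_FOLDER, exist_ok=True)
    path = os.path.join(UPLOAD_FOLDER, secure_filename(audio.filename))
    audio.save(path)
    return {"saved_to": path}
```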
## Server configuration
The server runs as a few different services, so there are different ways to view the logs.
To view the logs for ClassifAI-engine, the Flask server, you can run the following command:

```bash
sudo journalctl -u classifai
```
The Flask server's systemd service file is located at `/etc/systemd/system/classifai.service`. You can view it with `sudo systemctl cat classifai`.
To view the logs for the Redis queue worker, you can run one of the following commands:

```bash
sudo supervisorctl tail -5000 procname stderr
sudo supervisorctl tail -f procname stdout
```
The worker is also run through the Supervisor service file located at `/etc/supervisor/supervisord.conf`. Commands are imported through the `src` directory.
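For a rough picture of how work reaches that queue, here is a minimal sketch assuming the worker is a standard python-rq worker started from `src/`; the job path and file name are hypothetical.

```python
# Rough sketch of handing a job to the Redis queue worker.
# Assumes a standard python-rq setup; the job path and audio file
# name below are hypothetical, not the actual classifAI-engine jobs.
from redis import Redis
from rq import Queue

redis_conn = Redis(port=6379)         # REDIS_PORT from the .env file
queue = Queue(connection=redis_conn)  # default queue name

# Enqueue by dotted path so the worker (running from src/) imports the function.
job = queue.enqueue("transcription.transcribe", "raw_audio/example.wav")
print(job.get_id())
```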
## Environment variables
There's a `.env` file in the root directory of the server that you can use to set environment variables such as `ENV`, `REDIS_PORT`, and `GEMMA_API_URL`. Here is an example of the `.env` file:
```bash
ENV=development # This overrides whatever is in src/config/config.py. You can delete this line if you want to use the config.py file
REDIS_PORT=6379 # This is the default port for Redis
GEMMA_API_URL=http://localhost:5001 # Notice that the ClassifAI-engine Flask app is running on port 5000 by default, so the GEMMA API is running on port 5001
LLAMA_API_URL=http://localhost:5003 # This is the URL for the LLAMA API
HF_TOKEN=YOUR_HUGGINGFACE_API_KEY # This is the API key for the HuggingFace API
OPENAI_API_KEY=YOUR_OPENAI_API_KEY # This is the API key for the OpenAI API (Optional)
```
The `ENV` variable is also defined in the `config.py` file, so if you set it in the `.env` file, it will override the `config.py` value.
Furthermore, if you are using the GPT models, you will need to set the `OPENAI_API_KEY` variable in the `.env` file. Otherwise, it will not work.
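To make the override behaviour concrete, here is a minimal sketch of reading these variables with python-dotenv; the defaults and loading style are illustrative, and the actual loader in classifAI-engine may differ.

```python
# Minimal sketch of the .env-over-config.py precedence described above.
# Assumes python-dotenv; the real classifAI-engine loader may differ.
import os

from dotenv import load_dotenv

load_dotenv()  # reads .env from the project root, if present

CONFIG_PY_ENV = "dev"  # the ENV_TYPE default from src/config/config.py

ENV = os.getenv("ENV", CONFIG_PY_ENV)             # the .env value wins when it is set
REDIS_PORT = int(os.getenv("REDIS_PORT", "6379"))
GEMMA_API_URL = os.getenv("GEMMA_API_URL", "http://localhost:5001")
OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")      # only needed for the GPT models
```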
## Cronjobs

There's one cronjob that runs every day to remove old audio files. This is simple enough that it's just in the `crontab` file. We permanently store audio files on the web server, so there's no need to keep them in the `raw_audio` directory for more than a few days. To edit the cronjobs, you can run the following command:
```bash
sudo crontab -e
# Below is the cronjob, if you're interested:
5 3 * * * find /home/classgpu/classifAI-engine/src/raw_audio/ -mtime +3 -type f -delete
# This will delete all files in the raw_audio directory that are older than 3 days, every day at 3:05 AM
```
## Supervisor

The server uses Supervisor to manage the Redis queue worker. The configuration file is located at `/etc/supervisor/supervisord.conf`.
More information about Supervisor can be found in the Supervisor documentation.