Running a Home Server
Choosing to Run a Home Server
There are common reasons for running a home server: media streaming with Plex or Jellyfin, operating Network Attached Storage (NAS), hosting personal cloud storage with Nextcloud, managing IoT devices, or running a VPN with OpenVPN or WireGuard. For me, though, the most compelling reason is to move my personal projects away from large cloud service providers. This shields me from the hassles of navigating complex pricing structures, dealing with inflexible service tiers, and facing interruptions from arbitrary policy changes enforced by bots. Moreover, working closer to the metal keeps managed services from abstracting knowledge away from me.
Running a Hypervisor
I chose to run a hypervisor to easily manage multiple virtual machines (VMs). Hypervisors provide isolation, ensuring that disruptions or security issues in one VM do not affect others. They make it trivial to set up and tear down VMs, so experimenting with new configurations and technologies is cheap. Finally, they give me control over the allocation of CPU, memory, and storage, allowing me to tune each VM to the needs of its project, all from a single interface.
Choosing Proxmox
Proxmox supports both container-based (LXC) and full (KVM) virtualization, which gives me the flexibility of running lightweight containers or fully isolated VMs. Proxmox's web-based management interface is practical, offering easy access to setup and maintenance of environments, and simplifies the monitoring and adjustment of resources. Proxmox is more than I need, but it is a reasonable choice with a strong community and robust learning resources.
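As a sketch of how cheap setup and teardown can be, the snippet below drives Proxmox's qm command-line tool from Python to create, start, and later destroy a throwaway VM. The VM ID, name, and local-lvm storage are placeholders I've assumed for illustration; any node with the default vmbr0 bridge should look similar.

```python
import subprocess

def qm(*args: str) -> None:
    """Run a Proxmox `qm` command on the host, failing loudly on errors."""
    subprocess.run(["qm", *args], check=True)

VMID = "9001"  # hypothetical VM ID, assumed unused on this node

# Create a throwaway VM: 2 cores, 4 GB RAM, a 32 GB disk on local-lvm,
# and a virtio NIC bridged to vmbr0 (Proxmox's default bridge).
qm("create", VMID,
   "--name", "scratch-vm",
   "--cores", "2",
   "--memory", "4096",
   "--scsi0", "local-lvm:32",
   "--net0", "virtio,bridge=vmbr0")

qm("start", VMID)

# ...experiment with the new configuration...

# Tear it down again: stop the VM and remove it along with its disks.
qm("stop", VMID)
qm("destroy", VMID, "--purge")
```

The same lifecycle is available through the web interface; scripting it just makes repeated experiments reproducible.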
Hardware
I needed to strike a balance between cost and performance, particularly for training and fine-tuning AI models. I chose the AMD Ryzen 7 7800X3D for its strong single-thread performance and eight cores, a cost-effective choice for parallel workloads. A G.Skill Trident Z5 RGB kit (2 x 32 GB, DDR5-6400 CL32) provides 64 GB of RAM for handling large datasets efficiently. For storage, I selected the Samsung 980 Pro 2 TB NVMe SSD, which, although more expensive than a SATA SSD, justifies its cost with faster data access that shortens training runs. Finally, the NVIDIA RTX 4070 GPU, one of the more substantial investments in the build, was selected for its performance in AI workloads: its CUDA and Tensor cores are optimized for deep learning, accelerating computation and significantly shortening training and fine-tuning cycles.
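Before any training run, it's worth confirming the GPU is actually visible and reports the expected VRAM, especially once it has been passed through to a VM. A minimal PyTorch check, assuming the NVIDIA driver and a CUDA-enabled PyTorch build are installed in the guest:

```python
import time
import torch

# Confirm the RTX 4070 is visible to CUDA before kicking off training.
assert torch.cuda.is_available(), "No CUDA device found"

props = torch.cuda.get_device_properties(0)
print(f"GPU: {props.name}")
print(f"VRAM: {props.total_memory / 1024**3:.1f} GiB")
print(f"Compute capability: {props.major}.{props.minor}")

# Tensor cores are engaged automatically for FP16 matmuls; a quick
# timing sanity check on a large half-precision matrix multiply:
a = torch.randn(4096, 4096, dtype=torch.float16, device="cuda")
b = torch.randn(4096, 4096, dtype=torch.float16, device="cuda")
torch.cuda.synchronize()
start = time.perf_counter()
c = a @ b
torch.cuda.synchronize()
print(f"4096x4096 FP16 matmul: {(time.perf_counter() - start) * 1e3:.2f} ms")
```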
Note that 64 GB of RAM is sufficient for the smaller Llama 3 models, such as the 8B variant. Running Llama 3 70B, however, would require at least 128 GB of RAM. That means either maxing out my four slots with 32 GB sticks each or selling these and buying two 64 GB modules. Even then, a 70B model would be far beyond the RTX 4070's 12 GB of VRAM, leaving most of the model offloaded to the CPU. I decided that if I ever want to run very large models, it makes more sense to build a dedicated server just for that purpose.
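To sanity-check those numbers, a back-of-the-envelope estimate of a model's weight memory is its parameter count times bytes per parameter, plus some overhead for the KV cache and runtime buffers. The 20% overhead factor below is a rough assumption, not a measured value:

```python
# Rough memory estimate: parameters x bytes per parameter, plus an
# assumed 20% overhead for KV cache and runtime buffers.
OVERHEAD = 1.2

def model_memory_gb(params_billions: float, bytes_per_param: float) -> float:
    return params_billions * 1e9 * bytes_per_param * OVERHEAD / 1024**3

for name, params in [("Llama 3 8B", 8), ("Llama 3 70B", 70)]:
    for precision, nbytes in [("FP16", 2), ("8-bit", 1), ("4-bit", 0.5)]:
        print(f"{name} @ {precision}: ~{model_memory_gb(params, nbytes):.0f} GB")
```

At FP16, the 70B weights alone land around 150 GB, which is why anything below 128 GB of system RAM is a non-starter without aggressive quantization, while the 8B model fits comfortably in 64 GB at any precision.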