Networking a village @ETHIndia
In the world of high-stakes hackathons, there is a fundamental hierarchy of needs. Most people know the classic Indian adage: Roti, Kapda, aur Makaan (Food, Clothing, and Shelter). But for 800 hackers descending upon ETHIndia, that hierarchy had a critical fourth pillar: Solid, high-speed Internet.
The Evolution: Learning from the past
To understand 2024, you have to look back to our experience from previous years:
- 2022: Without authentication, the network was overwhelmed as participants connected multiple devices simultaneously. When performance dipped, hackers naturally switched to personal mobile hotspots, triggering an avalanche of RF interference that rendered the venue’s Wi-Fi unusable.
We realized we had to strictly limit connections to one device per person and block bandwidth-heavy speed test websites to protect the network's capacity. Moving forward, this also highlighted the need for a dedicated “LAN Police” team to identify and penalize personal hotspot usage, while educating hackers to use USB tethering as the only acceptable mobile backup.
We also published a Root Cause Analysis (RCA) for the networking issues at ETHIndia 2022.
- 2023: We introduced Ethernet to every table, but the daisy-chained infrastructure proved brittle. Because switches were hidden under tables, a participant accidentally kicking a cable could crash an entire section, making debugging a physical nightmare. On the logical side, we also battled a rogue DHCP server, likely from a stray machine. However, because the network was not yet segmented, we could never perform a definitive Root Cause Analysis.
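Had we been running MikroTik gear back then, one way to catch a rogue DHCP server is RouterOS's built-in DHCP alert. A hedged sketch, where the interface name and MAC address are placeholders rather than our real config:

```
# Hypothetical RouterOS sketch: alert whenever a DHCP server other than
# ours answers on a segment. "vlan-zone1" and the MAC are placeholders.
/ip dhcp-server alert
add interface=vlan-zone1 valid-server=AA:BB:CC:DD:EE:FF alert-timeout=1m \
    on-alert=":log warning \"rogue DHCP server seen on vlan-zone1\""
```

With per-zone VLANs, an alert like this would at least have pointed us at the offending segment instead of leaving the RCA open.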
Our mantra for 2024 was simple: Divide and Conquer.
The 2024 Architecture: Redundancy upon Redundancy
This year, we started planning four months in advance. We decided on a dual-track approach: a managed service by our Networking Partner and an in-house Disaster Recovery (DR) Setup that we built, managed and controlled entirely.
1. The Main Line
We secured a 3 Gbps Internet Leased Line (ILL) with a ring topology (two separate fiber terminations for redundancy).
- Hardware: We worked with our internet vendor to procure Ruckus Wi-Fi 6 High-Density Access Points. APs manufactured post-2017 handle congestion significantly better than the older Wi-Fi 5 models.
- Zoning: We physically and logically divided the hall into four zones. If one switch failed or one zone faced an attack, the rest of the hall remained unaffected.
2. The Disaster Recovery (DR) Setup
We wanted to build our own expertise. We deployed a "beast" of a router—the MikroTik CCR2116—capable of running internet for a small city.
- Active-Active Mode: Our DR setup wasn't just sitting there; it was actively serving about 10% of the venue at all times to ensure it was "warm" and ready to take over 100% of the load if the main system failed.
- Zoning & VLANs: We used six different VLANs (Zones 1-4, Backup, and Wi-Fi) with strict firewall rules. No inter-VLAN traffic was allowed, just pure, isolated internet access. We also ran Simple Queues with PCQ to cap each user at 20 Mbps with bursts up to 50 Mbps: fast enough to feel snappy, controlled enough to keep the pipe healthy.
- Live Telemetry: We monitored the CCR2116 end-to-end with Prometheus and Grafana over SNMP. CPU, RAM, temperature, and per-interface traffic, all in real time. With one zone running through it permanently, we had continuous proof that the DR path worked under real load, not just in theory. The screenshot below is from peak hours: sub-4% RAM, flat CPU across all 16 cores.
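The per-user shaping above can be sketched in RouterOS terms. This is a minimal sketch, not our exact config: the queue names, subnet, and burst-threshold value are illustrative.

```
# Hypothetical sketch of the 20 Mbps per-user cap with 50 Mbps bursts.
# Queue names and the 10.10.1.0/24 target are placeholders.
/queue type
add name=pcq-down kind=pcq pcq-classifier=dst-address pcq-rate=20M \
    pcq-burst-rate=50M pcq-burst-threshold=15M pcq-burst-time=10s
add name=pcq-up kind=pcq pcq-classifier=src-address pcq-rate=20M \
    pcq-burst-rate=50M pcq-burst-threshold=15M pcq-burst-time=10s
/queue simple
add name=zone1-per-user target=10.10.1.0/24 queue=pcq-up/pcq-down
```

PCQ builds one sub-queue per address, so a single Simple Queue on the zone's subnet enforces the cap for every user in that zone without per-user rules.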
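The telemetry path typically follows the standard snmp_exporter pattern, sketched below. The IPs, exporter port, and module name are assumptions for illustration, not our production values:

```yaml
# Hypothetical Prometheus scrape job polling the CCR2116 via snmp_exporter.
# 10.10.0.1 (router) and 10.10.0.2:9116 (exporter) are placeholder addresses.
scrape_configs:
  - job_name: "ccr2116-snmp"
    metrics_path: /snmp
    params:
      module: [mikrotik]           # snmp_exporter module name (assumed)
    static_configs:
      - targets: ["10.10.0.1"]     # the router being polled over SNMP
    relabel_configs:
      - source_labels: [__address__]
        target_label: __param_target
      - source_labels: [__param_target]
        target_label: instance
      - target_label: __address__
        replacement: "10.10.0.2:9116"   # where snmp_exporter listens
```

Prometheus scrapes the exporter, the exporter walks the router's SNMP tables, and Grafana charts the resulting CPU, RAM, temperature, and per-interface counters.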
Innovation in the NOC: The "Ping" Device
One of the highlights of our setup was a custom device we built: P.I.N.G (Raspberry PI Network Gauge).
In previous years, debugging required carrying a MacBook (and a dongle!) to every switch, and every table. This year, we used a handheld Pi with a screen that ran real-time pings, traceroutes, DNS resolution tests, and speed tests the moment it was plugged into any Ethernet port.
It was so effective that the ISP engineers asked us to build 10 units for their own internal use (xD)! It turned the "dark art" of network debugging into a simple, handheld reality.
Swag with a Purpose
In 2023, we handed out Ethernet cables on request. But when Wi-Fi buckled under load, we had to scramble to get USB-C dongles into every participant’s hands. We had rate limiting in place via captive portals bound to MAC addresses, but participants swapped dongles constantly, so the limits were never associated with the right person.
For 2024, we skipped the scramble entirely. Every participant received a 1 Gbps USB-C to Ethernet adapter as part of their swag: branded, personal, and theirs to keep. With everyone on a dedicated dongle, we dropped the captive portal and MAC-based usage limiting altogether, and enforced a simple per-IP rate limit on the firewall instead.
On-ground Challenges
The main internet vendor’s captive portal was accidentally bound to 1.1.1.1 — Cloudflare’s public DNS. Anyone resolving through Cloudflare lost internet access, and Telegram (our primary comms channel) went dark with it.
With a large venue, hundreds of hackers moving around, and cables getting accidentally disconnected, we had to be obsessive about cable management, zip-tying runs at 2 AM to keep everything locked down.
Then there was KTPO’s metal roof. Signals bounce off it in every direction, turning the hall into an RF mess. We patched Wi-Fi gaps in specific sub-networks using TP-Link Archer C6 routers.
The Result: Invisible Success
In networking, if nobody is talking about you, you’ve won.
We received feedback from sponsors and hackers alike that this was the most stable internet they had ever experienced at a hackathon.
The Lessons for the future
- Nobody cares about your architecture — until it breaks. We ran dual ISPs, six VLANs, and active-active failover. Not one hacker asked how. That’s the point. The best infrastructure is the kind nobody has to think about.
- Give people the hardware, don’t make them ask. In 2023, we scrambled to hand out dongles mid-event. In 2024, every participant walked in with a branded Ethernet adapter in their swag bag. One change killed the captive portal, the MAC tracking mess, and the support queue in one shot.
- Build your own debugging tools. Carrying a MacBook and a dongle to every table doesn’t scale. A $50 Raspberry Pi with a screen replaced an entire NOC workflow — and the ISP’s engineers wanted ten of them.
- Segment everything. Trust nothing. A rogue DHCP server in 2023 took us down and we couldn’t even trace it. Strict VLANs and firewall rules in 2024 meant one zone could burn without taking the hall with it.
- RF is the real boss fight. Metal roofs, 800 personal hotspots, Wi-Fi 5 APs from 2015 — no amount of bandwidth fixes physics. Ban hotspots, enforce wired-first, and zip-tie your cable runs at 2 AM like the rest of us.
Until next time, never stop surfing 🏄‍♂️