dYdX Operations DAO
Site Reliability Engineer
As the dYdX ecosystem nears the Mainnet launch of dYdX V4, a fully decentralized and open-source version of the protocol, it is crucial for the dYdX community to actively promote the growth and expansion of the dYdX DAO. The dYdX DAO is expected to comprise several autonomous subDAOs, each focusing on a core functional area of the dYdX protocol and ultimately accountable to the dYdX community. Encompassing all community programs and initiatives, the dYdX DAO aims to further the vision of complete decentralization.
The dYdX Operations Trust (“DOT”) emerged as the second subDAO, established through an on-chain proposal that passed on December 18th, 2022. After the on-chain proposal passed, the DOT was created as a Guernsey non-charitable Purpose Trust entity.
At the dYdX Operations Trust, you will have an opportunity to foster community-driven growth around decentralized technology that will redefine global financial markets. By joining at this early stage, you will play a vital role in making fundamental decisions that shape the trajectory of the dYdX ecosystem. In this crucial position, you will manage the day-to-day operations of the DOT, supervise strategic projects, and ensure task execution in alignment with the goals of the broader dYdX DAO. Your efforts will be key to fostering collaboration, operational efficiency, and exceptional performance throughout the ecosystem.
About The Role
We are seeking a Site Reliability Engineer to join the dYdX Operations Trust as one of its first full-time hires. As an SRE, you will ensure our services are performant and predictable, develop and automate various services, and help maintain enterprise-grade, scalable production systems with a focus on stability, availability, and security. You will play a key role in improving processes and in proactively identifying issues to be resolved.
What You’ll Be Doing
- Ensure the reliability, scalability, and efficiency of the dYdX DAO’s decentralized infrastructure
- Ensure high availability and low latency, and manage the performance, capacity, scalability, and deployment of the dYdX DAO’s infrastructure
- Apply strong analytical and problem-solving skills to debug and optimize code and to automate routine tasks
- Assist in the design and improvement of monitoring, alerting and remediation solutions with a focus on proactively identifying and addressing production issues
- Implement monitoring, logging, and alerting systems to proactively identify and resolve performance issues or potential failures
- Investigate and lead efforts to remediate critical operational production issues
- Coach teams across the dYdX ecosystem on best practices for deployment, observability and scalability
- Continuously monitor system performance, identify bottlenecks, and optimize resource utilization to maintain high availability and response times
- Troubleshoot and resolve incidents and outages, collaborating with developers, community members, and external partners when required
- Create and maintain technical documentation for infrastructure as code/data, using version control tools
- Support security audits and vulnerability assessments to identify potential risks and implement appropriate safeguards
- Support services through activities such as system design consulting, developing software platforms and frameworks, and capacity planning
Requirements
- Based in Europe, the Middle East, or Asia
- Highly proficient in both written and spoken English
- A minimum of 3-5 years of experience in systems/software engineering
- Experience programming in Go, Python, Java, C++, or C
- Previous work with algorithms, data structures and/or open source software
- A “get it done” attitude: a bias toward action, strong collaboration, and skill at resolving ambiguity, constantly pushing toward clarity and delivery
- In-depth understanding of DAOs / previous experience as a DAO contributor
- Ability to communicate effectively and in a structured manner across all mediums