The concept of SRE originated at Google in 2003, and its credit goes to Ben Treynor Sloss. He was the originator of the term SRE, who handled a production team of seven engineers. Ben Treynor asked his team members to spend half of their time on operations tasks. It helps the team to get a better understanding of how to develop software. Besides, it helped him to complete tasks successfully.
A site reliability engineer (SRE) acts as a link between development and IT operations and performs the duties normally done by the operations. Usually, these engineers use automation technologies to solve problems by developing scalable and trustworthy software systems. The primary goal of SRE is to create software systems and automated solutions for operational issues. As a result, SRE performs the work traditionally performed by operations. They utilize engineers with software expertise to solve complex problems.
The main role of SRE teams is writing and developing code to automate processes like analyzing logs, testing production environments, and responding to issues. The engineer who uses this approach will become an expert in writing code. It also allows developers to focus solely on feature development and bring new features to production. An SRE can automate solutions to any recurring problem and reduce the workload of the operations team.