Why are ESG ratings so different?

Whenever a company, fund manager, fund or individual is interested in the world of sustainability, they are often surprised that ESG ratings are so different.

Pedro Olazabal, Head of Impact at Zubi Group, explains why this is so and which measures and regulations are in the works to strengthen trust in sustainability ratings and make them easier to compare.

We would all like a benchmark that simply tells us “this is sustainable and this is not”. But once you understand how an ESG rating is constructed, it becomes clear that differences are normal. The odd thing would be for ratings to agree.

The ultimate goal of a rating is to produce a single “number” that shows how sustainable a company is. Building an ESG* rating to obtain that number involves four steps, and each of them can lead to divergences in the final result.

1- Selection of evaluation categories

The first step is to select which evaluation categories to consider. The list of relevant categories or topics is not set in stone anywhere, so each institution that builds its own rating has to decide which issues to include and which to leave out.

Some issues come to mind immediately and are very likely to appear in every rating: climate change, diversity and inclusion, waste, human rights and so on. Others are less well known and may appear in some ratings but not in others, for example acidification or product accessibility.

This means that ESG ratings can already differ after this first step.
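To make this first divergence concrete, here is a minimal Python sketch. The two providers and their category lists are hypothetical, built only from the examples mentioned above; real providers publish their own, much longer lists.

```python
# Hypothetical category lists for two rating providers (illustrative only).
categories_a = {
    "climate change", "diversity and inclusion", "waste",
    "human rights", "acidification",
}
categories_b = {
    "climate change", "diversity and inclusion", "waste",
    "human rights", "product accessibility",
}

shared = categories_a & categories_b   # assessed by both providers
only_a = categories_a - categories_b   # assessed only by provider A
only_b = categories_b - categories_a   # assessed only by provider B

print("Shared:", sorted(shared))
print("Only A:", sorted(only_a))
print("Only B:", sorted(only_b))
```

Even with four categories in common, each provider scores the company on something the other one never looks at.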

 

2- Evaluation or measurement of categories

The second step is to measure each category. For example, how do we measure whether a company is acting on climate change? Do we ask about its Scope 1 and 2 carbon footprint? Do we include Scope 3? Do we ask how that footprint has evolved? About its future commitments?

Assessing each selected category means asking the right questions, ones that capture both the complexity of the issue and how the company is responding to it.

Each ESG rating agency will have its own view of which questions a company has to answer. This is therefore a second source of divergence, and it is likely to widen the gap between ESG rating results.

 

3- Standardisation

The third step is to standardise. Colloquially: “you can only add pears to pears, not pears to apples”. One of the indicators we have obtained may be in m³ of water, another in kg of CO2, another a pay gap in %, and others may be qualitative answers such as “we have a human rights due diligence plan in our supply chain”.

The questions that follow are “how do we get a single number now?” and “how do we add up these answers?”.

We need to convert apples, bananas and avocados into pears, and that normalisation process can be very complex or very simple.

In ESG ratings, normalisation is usually quite simple and works as follows:

Each question has a scale of options that assigns either the maximum score (1) or a lower score (between 0 and 1). For a question on the pay gap, for example, the scale could be:

  • 0 points if the pay gap is greater than 25%
  • 0.2 points if the pay gap is greater than 20% and up to 25%
  • 0.4 points if the pay gap is greater than 15% and up to 20%
  • 0.6 points if the pay gap is greater than 10% and up to 15%
  • 0.8 points if the pay gap is greater than 5% and up to 10%
  • 1 point if the pay gap is 5% or less

The same model can be applied to any type of question.

It is easy to see that the results can vary widely depending on how this normalisation is done. Points can be assigned in many different ways, and the ratings will differ accordingly.
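As a rough illustration, the pay gap scale above can be written as a small Python function and compared with a second, linear scale. The function names are made up for this sketch, treating the upper bound of each band as inclusive is an assumption, and the linear alternative is hypothetical rather than taken from any real rating.

```python
def banded_pay_gap_score(gap_percent: float) -> float:
    """Score a pay gap (in %) with the banded scale described above.

    Assumption: the upper bound of each band is inclusive.
    """
    # (upper bound of the band in %, score awarded)
    bands = [(5, 1.0), (10, 0.8), (15, 0.6), (20, 0.4), (25, 0.2)]
    for upper, score in bands:
        if gap_percent <= upper:
            return score
    return 0.0  # gap greater than 25%


def linear_pay_gap_score(gap_percent: float) -> float:
    """Hypothetical alternative: score falls linearly from 1 (0% gap) to 0 (25% gap)."""
    return max(0.0, min(1.0, 1.0 - gap_percent / 25.0))


gap = 9.0  # the same company answer, scored by two different scales
print(banded_pay_gap_score(gap))            # 0.8
print(round(linear_pay_gap_score(gap), 2))  # 0.64
```

The same 9% pay gap earns 0.8 on one scale and 0.64 on the other, before any weighting has been applied.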

 

4- Weighting

The fourth step is weighting. Once every answer has a score, the question arises whether all categories should be worth the same: “Is climate change more important than diversity?” Even within a category, some questions may be considered more important than others.

Questions deemed more important are weighted more heavily and influence the final result more.

Weighting is therefore another step where divergences between ESG ratings may arise.
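A minimal sketch of the weighting step, assuming the final rating is a plain weighted average of the normalised scores (one common way of aggregating, not necessarily what any given agency does). The scores, weights and agency names are invented for illustration.

```python
def weighted_rating(scores: dict[str, float], weights: dict[str, float]) -> float:
    """Weighted average of category scores, each between 0 and 1."""
    total_weight = sum(weights[category] for category in scores)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(scores[c] * weights[c] for c in scores) / total_weight


# Hypothetical normalised scores for a single company.
scores = {"climate": 0.4, "diversity": 0.8, "waste": 0.6}

# Two hypothetical agencies that differ only in how they weight the categories.
weights_a = {"climate": 3, "diversity": 1, "waste": 1}
weights_b = {"climate": 1, "diversity": 3, "waste": 1}

rating_a = weighted_rating(scores, weights_a)
rating_b = weighted_rating(scores, weights_b)
print(round(rating_a, 2))  # 0.52
print(round(rating_b, 2))  # 0.68
```

Both agencies start from identical scores, yet the company ends up with 0.52 in one rating and 0.68 in the other purely because of the weights.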

And shouldn’t this be dynamic by sector?

Once the rating has been completed, the same questions always come up: shouldn’t all this depend on the company’s sector? What about size? What about the countries in which it operates? What about…?

Each of these variables can affect every one of the steps above.

  1. Selection of evaluation categories. The selection can be made sector-dependent, so that categories that are not relevant to a sector are left out.
  2. Evaluation or measurement of categories. The number of questions, or even the questions themselves, can differ depending on the size of the company, its sector, etc.
  3. Standardisation. The points assigned to each answer may vary by sector. For example, the same carbon footprint may count as very high in one sector and very low in another.
  4. Weighting. The weight assigned to each question can vary considerably with size or sector. For example, diversity may be of little relevance in a very small company but highly relevant in a large one.
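The sector effect on standardisation (point 3 above) can be sketched as follows. The sector names, the benchmark values and the choice of a linear scale are all assumptions made for this example; they are not real benchmarks.

```python
# Hypothetical sector benchmarks: (best, worst) carbon intensity, in tonnes of
# CO2 per million euros of revenue. Values are illustrative only.
SECTOR_BENCHMARKS = {
    "cement": (300.0, 900.0),
    "software": (5.0, 50.0),
}


def carbon_score(intensity: float, sector: str) -> float:
    """Scale a carbon intensity to 0..1 against the sector's own range."""
    best, worst = SECTOR_BENCHMARKS[sector]
    raw = (worst - intensity) / (worst - best)
    return max(0.0, min(1.0, raw))  # clamp to the 0..1 range


# The same footprint, judged against two different sectors.
print(carbon_score(100.0, "cement"))    # 1.0
print(carbon_score(100.0, "software"))  # 0.0
```

An intensity of 100 is better than the best cement benchmark and worse than the worst software benchmark, so the same figure earns the maximum score in one sector and the minimum in the other.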

 

Tobacco companies and ESG ratings

Every now and then a tobacco company receives a high ESG rating, which generates an interesting public debate.

Leaving the ethical debate aside, what happens in these cases is that ESG ratings tend to focus on how companies operate, not on the products and services they bring to the market.

Thus, a tobacco company may follow good practices in its operations (net zero strategy, a supply chain that respects human rights, no glass ceiling, etc.) even though its product is harmful to health. This can lead to a high rating.

The purpose of this blog post is not to give our views on this, but to explain why it happens.

 

Conclusions

Building a sustainability rating involves several unavoidable steps. Each step requires decisions that influence the final result. These decisions are up to each institution, so it is not uncommon for them to differ from one rating agency to the next.

The role of the regulator

In response, the EU is starting to show signs that it intends to bring order to this area. The European Commission has announced that it is proposing a regulation for ESG ratings providers: “This initiative aims to strengthen the reliability and comparability of ESG ratings”. We will keep an eye on this.

 

*In reality this applies to any synthetic indicator. A synthetic or aggregate indicator is one that synthesises or aggregates several indicators in an attempt to capture complex realities.