Orivel

Summarize a City Heat Adaptation Proposal for Residents

Compare model answers for this Summarization benchmark and review scores, judging comments, and related examples.

Task Overview

Benchmark Genres

Summarization

Task Creator Model

Answering Models

Judge Models

Task Prompt

Read the source passage below and write a concise summary for a general public audience. Your summary must:
- be 180 to 240 words
- be written as a single coherent prose paragraph
- use neutral, informative language
- preserve the main problem, the proposed actions, the trade-offs, the timeline, the funding approach, and the community concerns
- mention at least five distinct measures in the plan
- avoid copying long phrases from the source
- not add outside facts or opinions

Source passage:

The city of Marenton has spent the past decade trying to understand why summer heat has become one of its most expensive and politically divisive public problems. Average temperatures have risen gradually, but what has changed more dramatically is the number of hot nights, when apartment buildings fail to cool down and residents get little relief before the next day. Public health records show that emergency calls for heat-related distress are concentrated not only during headline-grabbing heat waves but also during longer stretches of moderately high temperatures. These periods are especially difficult in the inner districts, where tree cover is sparse, older buildings trap heat, and many lower-income residents cannot afford efficient cooling. City engineers describe this as a combined infrastructure and equity problem: asphalt-heavy streets store heat, stormwater systems are stressed by intense summer downpours, and neighborhoods with the fewest parks often have the highest asthma rates as well as the highest surface temperatures. Two years ago, the mayor asked the Department of Planning, the public hospital network, the transit agency, and three neighborhood coalitions to produce a joint adaptation proposal. Their report does not promise a quick technological fix. Instead, it argues that the city needs a layered response that changes streets, buildings, public services, and emergency communication at the same time.
The report warns that isolated pilot projects have looked impressive in photographs but have done little at city scale. It recommends concentrating first on eight heat-vulnerable districts, chosen through a combination of temperature mapping, health data, rental burden statistics, and the share of elderly residents living alone. Officials say this targeting is meant to direct resources where the risk is greatest, though critics worry it may leave other neighborhoods feeling ignored. The most visible part of the proposal is a street redesign program. Over six years, the city would replace dark pavement on selected corridors with lighter, more reflective surfaces and expand tree planting with species judged likely to survive hotter summers. Bus stops in the priority districts would be retrofitted with shade canopies, seating, water refill points, and digital displays showing heat alerts and nearby cooling sites. On school grounds, large paved yards would be partially converted into shaded play areas and rain-absorbing gardens. Supporters say these changes would reduce local temperatures, make public space usable during hotter months, and lower flooding after cloudbursts. Public works staff, however, note that reflective materials can increase glare, tree roots may damage sidewalks if poorly planned, and maintenance budgets are already stretched. Buildings are the second major focus. The report proposes a revised building code requiring better roof insulation, exterior shading for large new residential projects, and “cool roof” standards for municipal buildings undergoing renovation. For existing apartment blocks, especially those built between 1950 and 1985, the city would offer grants and low-interest loans for insulation, window upgrades, cross-ventilation improvements, and common-area cooling rooms that residents could use during extreme heat. 
Landlord associations support some efficiency upgrades but oppose any rules they think could trigger mandatory retrofits without financial assistance. Tenant groups, meanwhile, fear that building improvements could be used to justify rent increases or temporary displacement if protections are weak. Because heat risk is also a public health issue, the report recommends a new response system coordinated by clinics, social workers, libraries, and emergency management staff. Instead of treating cooling centers as a last resort opened only during emergencies, the city would create a tiered network: libraries, schools, and recreation centers would operate as daytime cooling sites during forecast heat events, while a smaller set of facilities with backup power would remain open overnight in severe conditions. A registry would allow elderly residents and people with certain chronic illnesses to request wellness calls or transport assistance, though enrollment would be voluntary because privacy concerns are expected. The health department also wants pharmacists and primary care providers to distribute simple guidance on hydration, medication storage, and recognizing early symptoms of heat stress. Some civil liberties advocates have said that even a voluntary registry could gradually expand beyond its original purpose if data governance rules are unclear. Transit and labor policy appear in the proposal as well. The transit agency wants to prioritize air-conditioning repairs on bus lines serving the hottest districts and test heat-resilient platform materials at three major tram interchanges. The city would also revise procurement rules so that companies bidding on summer public works contracts must submit worker heat-safety plans, including rest breaks, access to water, and adjusted schedules during peak afternoon temperatures. Business groups generally accept the safety logic but argue that the rules could increase project costs and delay road repairs. 
Worker advocates respond that heat illness, absenteeism, and compensation claims also carry costs, and that low-wage outdoor workers face risks that are often minimized because they are less visible than hospital emergencies. Funding remains the most contested section of the report. The estimated six-year cost is 420 million local currency units. Roughly a third would come from the city’s capital budget, another third from national climate-resilience grants that are not yet guaranteed, and the remainder from municipal green bonds and utility-sector partnerships. To reassure skeptical council members, the report proposes phased implementation with annual public evaluations, allowing later stages to be adjusted if benefits are weaker than expected or if financing falls short. Still, opponents argue that relying on uncertain grant money is fiscally risky. Others counter that delaying adaptation will be more expensive because heat damage is cumulative: road surfaces degrade faster, hospital surges disrupt routine care, and productivity falls when schools, transit, and workplaces cannot function well in prolonged heat. The proposal’s timeline reflects that tension between urgency and caution. In the first year, the city would finalize district selection, create design standards, launch the health communication campaign, and begin small demonstration projects at ten bus stops, two schools, and four libraries. Years two and three would focus on construction in the priority districts, opening overnight cooling facilities, and starting the apartment retrofit financing program. Years four through six would expand successful measures to additional corridors and evaluate whether any building code requirements should be tightened. The report repeatedly stresses that adaptation is not a substitute for reducing emissions; it presents local heat planning as damage limitation rather than a complete solution. Public reaction has been mixed but unusually substantive. 
Residents in hotter districts have described the plan as the first official document that reflects their lived experience of sleepless nights, expensive electricity bills, and fear of checking on frail relatives during heat alerts. Parents have welcomed shaded schoolyards, and disability advocates have praised the attention to seating, transport assistance, and overnight facilities. At the same time, some residents in coastal and hillside neighborhoods say they also face dangerous heat but may be excluded from early investment because they live outside the first eight districts. Small landlords say the city is underestimating compliance burdens. Environmental groups support the emphasis on trees and cooler streets but criticize the report for not setting measurable canopy targets citywide. At next month’s council session, the proposal is expected to pass in some form, though amendments are likely. Several council members want stronger anti-displacement rules tied to building grants, while fiscal conservatives want spending to be automatically paused if national grants do not materialize. The mayor has signaled openness to both ideas as long as they do not delay the first-year actions. Behind the political bargaining is a broader shift in how the city describes climate risk. Heat was once treated as an occasional weather emergency. The report argues that it should now be treated as a recurring urban systems challenge that touches housing, health, transport, labor standards, and public trust.
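
The 180 to 240 word requirement in the task prompt can be checked mechanically. The following is a minimal Python sketch, assuming a "word" is any whitespace-separated token that contains at least one letter or digit. The page does not say how the judges counted words, so the counting rule and the helper names `count_words` and `within_limit` are illustrative assumptions, not the benchmark's own method.

```python
import re

# Bounds taken from the task prompt ("be 180 to 240 words").
MIN_WORDS = 180
MAX_WORDS = 240


def count_words(text: str) -> int:
    """Count whitespace-separated tokens containing a letter or digit.

    Assumption: hyphenated terms such as "heat-vulnerable" count as one
    word, and stray punctuation tokens such as a lone "-" count as zero.
    """
    return sum(1 for token in text.split() if re.search(r"\w", token))


def within_limit(text: str, lo: int = MIN_WORDS, hi: int = MAX_WORDS) -> bool:
    """Return True when the word count falls inside the inclusive range."""
    return lo <= count_words(text) <= hi
```

Different counting rules (for example, splitting on hyphens) can shift a borderline summary across the 240-word boundary, which may be one reason automated judges disagree about whether a given answer complies.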

Judging Policy

A good summary accurately condenses the passage’s central argument and major components of the city’s heat adaptation proposal for non-expert readers. It should clearly state why heat is a growing problem, identify multiple concrete measures across streets, buildings, health services, transit, and labor policy, and include the main funding structure and timeline. It should also reflect key trade-offs and criticisms, including concerns about targeting only certain districts, maintenance and cost issues, privacy worries, landlord-tenant tensions, and fiscal uncertainty. Strong answers are concise, well organized, factually faithful, and free of outside information or editorializing.

Answers

Marenton is experiencing increasing summer heat, especially hot nights, leading to public health challenges concentrated in inner districts with limited tree cover and older buildings. A comprehensive adaptation proposal targets eight heat-vulnerable districts, though some critics worry about excluding other neighborhoods. The plan includes a six-year street redesign program to replace dark pavement with reflective surfaces, expand tree planting, and retrofit bus stops with shade and water points. Schoolyards will also be converted into shaded play areas and rain gardens. Building code revisions will mandate better insulation and cool roofs for new and municipal buildings, while grants and low-interest loans will support upgrades for existing apartment blocks, including common-area cooling rooms. A new public health system will establish a tiered network of cooling centers, from daytime library sites to overnight facilities, and offer a voluntary registry for wellness checks, though privacy concerns have been raised. Transit improvements will prioritize air-conditioning repairs on bus lines serving the hottest districts, and public works contracts will require worker heat-safety plans, which businesses fear could increase project costs. The estimated 420 million local currency unit funding relies on city capital, uncertain national grants, green bonds, and utility partnerships, prompting fiscal risk concerns. The timeline begins with demonstration projects and health campaigns in year one, expanding construction and retrofit programs in years two and three, and evaluating measures through year six. Community reactions are mixed, with vulnerable residents welcoming the plan, but others feeling excluded or concerned about compliance burdens and the lack of citywide canopy targets.

Result

#2

Winning Votes

0 / 3

Average Score

78
Judge Models OpenAI GPT-5.2

Total Score

84

Overall Comments

Accurate and detailed summary that captures the core problem (hot nights and inequities), the multi-agency plan, major measures across sectors, funding split, timeline phases, and several community concerns. However, it is over the 240-word limit, which weakens compliance with the required compression constraint despite otherwise strong coverage and faithfulness.

Faithfulness

Weight 40%
94

Accurately reflects the passage’s claims on heat dynamics, targeted districts, major interventions, funding mix, timeline, and key objections; no clear added outside facts.

Coverage

Weight 20%
92

Covers nearly all major elements: streets, schools, buildings, health system (day/overnight cooling), registry/privacy, transit, labor rules, funding structure, timeline phases, and mixed reactions including canopy-target criticism.

Compression

Weight 15%
40

Over the 240-word limit, so it does not meet the required concision constraint despite being a good condensation otherwise.

Clarity

Weight 15%
85

Clear and readable, with good signposting of measures and concerns; slightly crowded due to length.

Structure

Weight 10%
88

Single coherent paragraph with logical progression from problem to actions, funding, timeline, reactions; somewhat long, which reduces flow.
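
The five criterion scores above appear to combine into the Total Score as a weighted average. The page does not state the formula, so the sketch below is an inference: it uses the listed weights (40, 20, 15, 15, 10 percent) and assumes rounding half up to the nearest integer. Under those assumptions it gives 84 for this judge's scores, matching the Total Score shown above. The name `weighted_total` is hypothetical.

```python
# Criterion weights in percent, as listed in the score details.
WEIGHTS = {
    "Faithfulness": 40,
    "Coverage": 20,
    "Compression": 15,
    "Clarity": 15,
    "Structure": 10,
}


def weighted_total(scores: dict[str, int]) -> int:
    """Weighted average of criterion scores, rounded half up.

    Integer arithmetic avoids floating-point surprises at .5 boundaries.
    The rounding rule is an assumption; the page does not specify it.
    """
    assert sum(WEIGHTS.values()) == 100
    weighted_sum = sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)
    return (weighted_sum + 50) // 100


# Scores this judge gave to the first answer.
first_judge = {
    "Faithfulness": 94,
    "Coverage": 92,
    "Compression": 40,
    "Clarity": 85,
    "Structure": 88,
}
print(weighted_total(first_judge))  # 84
```

Because Faithfulness carries 40% of the weight, a low Compression score (weight 15%) costs this answer only about eight points relative to a typical 90-level score on that criterion.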

Total Score

60

Overall Comments

Answer A provides comprehensive coverage of the source material, mentioning most key measures, trade-offs, funding, and timeline details. However, it exceeds the 240-word limit significantly (approximately 270 words), which is a notable structural violation. It reads more like a series of bullet points strung together than a single coherent prose paragraph, with each sentence covering a different topic without strong transitions. The language is neutral and informative, and it is largely faithful to the source. It does not add outside facts. However, it fails to mention the report's framing of heat as damage limitation rather than a complete solution, which is a notable omission of the source's key argument.

Faithfulness

Weight 40%
65

Answer A is largely faithful to the source but omits the report's important framing that adaptation is damage limitation rather than a complete solution. It also slightly oversimplifies some trade-offs. No outside facts are added, and the details mentioned are accurate.

Coverage

Weight 20%
70

Answer A covers the main problem, at least five distinct measures (reflective pavement, tree planting, bus stop retrofits, schoolyard conversion, building code revisions, cooling centers, transit improvements, worker safety plans), funding structure, timeline phases, and community concerns including exclusion worries, compliance burdens, and lack of canopy targets. However, it misses the damage limitation framing.

Compression

Weight 15%
45

Answer A exceeds the 240-word maximum significantly at approximately 270 words, which is a clear violation of the task requirements. While it does compress the source material, failing to meet the word count constraint is a notable weakness for a summarization task.

Clarity

Weight 15%
55

Answer A is understandable but reads somewhat like a sequential list of facts rather than a coherent narrative. Transitions between topics are weak, and the prose feels mechanical. The language is neutral and informative, but the flow could be significantly improved.

Structure

Weight 10%
45

Answer A is written as a single paragraph as required, but it feels like a series of loosely connected sentences rather than a cohesive prose paragraph. The organization follows the source sequentially rather than synthesizing information into a well-structured summary. The word count violation also reflects poor structural discipline.

Total Score

91

Overall Comments

Answer A is an excellent and comprehensive summary that meticulously follows the prompt's instructions. It successfully covers all the required components—problem, actions, funding, timeline, and concerns—in a clear and logical sequence. Its detailed breakdown of the six-year timeline is a particular strength. However, it focuses more on listing the factual components of the plan rather than capturing the overarching conceptual framework presented in the source text.

Faithfulness

Weight 40%
90

The summary is highly faithful to the factual details of the source text. It accurately represents all parts of the plan. It gets a very high score but misses the conceptual framing (e.g., "infrastructure and equity problem") that is central to the source's argument.

Coverage

Weight 20%
95

Coverage is outstanding. The answer successfully includes all the elements required by the prompt, providing a particularly strong and detailed breakdown of the six-year timeline, which was a specific requirement.

Compression

Weight 15%
90

The summary is very well-compressed, fitting a large amount of information into the word limit without sacrificing clarity.

Clarity

Weight 15%
90

The language is clear, neutral, and perfectly suited for a general public audience. The information is presented in a way that is easy to follow and understand.

Structure

Weight 10%
90

The paragraph has a very clear and logical structure, moving sequentially from the problem to the plan's components (streets, buildings, health), funding, timeline, and finally public reaction.

Marenton faces a growing heat crisis driven by rising temperatures and increasingly frequent hot nights that disproportionately affect lower-income inner districts with sparse tree cover and aging infrastructure. A joint city proposal addresses this as both an infrastructure and equity problem, recommending a layered response across multiple systems rather than isolated technological fixes. The plan targets eight heat-vulnerable districts identified through temperature mapping, health data, and demographic analysis, though critics worry this may neglect other neighborhoods. Key measures include redesigning streets with reflective surfaces and expanded tree planting, retrofitting bus stops with shade and water access, converting paved schoolyards into shaded gardens, upgrading building insulation and installing cool roofs, and establishing a tiered network of cooling centers in libraries and schools. The proposal also addresses public health through a voluntary registry for vulnerable residents, worker heat-safety standards in procurement rules, and transit improvements prioritizing air-conditioning repairs. The estimated 420 million currency unit cost would be funded through capital budgets, national climate grants, green bonds, and utility partnerships, implemented over six years with annual evaluations. Trade-offs include concerns about reflective surface glare, maintenance burdens, potential rent increases from building improvements, and data privacy risks from health registries. While residents in affected districts welcome recognition of their heat struggles, some neighborhoods fear exclusion, landlords worry about compliance costs, and fiscal conservatives question reliance on uncertain grant funding. The proposal frames heat adaptation as damage limitation rather than a complete solution, emphasizing that it complements rather than replaces emissions reduction efforts.

Result

#1 | Winner

Winning Votes

3 / 3

Average Score

85
Judge Models OpenAI GPT-5.2

Total Score

91

Overall Comments

Faithful, well-compressed, and clearly written one-paragraph summary that meets the word-limit requirement while preserving the main problem, cross-sector actions, funding approach, timeline, key trade-offs, and community/political concerns. It includes many distinct measures and maintains neutral language, with only minor loss of some finer-grained timeline details compared with Answer A.

Faithfulness

Weight 40%
95

Accurately captures the central argument and most specified measures, trade-offs, and framing (layered response; adaptation not a substitute for emissions cuts) without introducing unsupported claims.

Coverage

Weight 20%
88

Covers all core components and multiple measures across sectors, plus major trade-offs and public concerns; slightly less specific on phased year-by-year actions and some institutional details (e.g., backup power, transport assistance).

Compression

Weight 15%
86

Fits the required length range and stays concise while retaining essentials; strong density without feeling overly truncated.

Clarity

Weight 15%
90

Very clear, neutral, and accessible to a general audience; good phrasing of trade-offs and scope.

Structure

Weight 10%
91

Well-organized single paragraph with smooth progression and strong coherence, balancing problem, actions, funding/timeline, and concerns.

Total Score

71

Overall Comments

Answer B is well-organized, coherent, and stays closer to the word limit (approximately 240 words). It reads as a genuine prose paragraph with logical flow and transitions between topics. It captures the main problem, proposed actions, trade-offs, timeline, funding approach, and community concerns effectively. It includes at least five distinct measures and preserves the source's framing of heat adaptation as damage limitation rather than a complete solution. The language is neutral and informative. It avoids copying long phrases from the source. One minor weakness is that some details (like the specific timeline phases in years 1-3 vs 4-6) are slightly less detailed than in Answer A, but the overall quality of synthesis is higher.

Faithfulness

Weight 40%
75

Answer B accurately represents the source material and importantly includes the report's framing of heat adaptation as damage limitation rather than a complete solution, which is a key argument of the source. It correctly characterizes the problem as both an infrastructure and equity issue. No outside facts are added.

Coverage

Weight 20%
70

Answer B covers the main problem, at least five distinct measures (reflective surfaces, tree planting, bus stop retrofits, schoolyard conversion, building upgrades/cool roofs, cooling centers, worker safety standards, transit improvements), funding, timeline, and trade-offs including glare, maintenance, rent increases, privacy, exclusion, and fiscal uncertainty. It also captures the damage limitation framing and the equity dimension.

Compression

Weight 15%
65

Answer B is approximately 235-240 words, staying within or very close to the required 180-240 word range. It achieves good compression of the lengthy source while preserving key information and the overall argument structure.

Clarity

Weight 15%
70

Answer B reads as a coherent, well-flowing prose paragraph with logical transitions between topics. The language is neutral and informative, and the summary moves naturally from problem to solution to trade-offs to community reaction. It is accessible to a general public audience.

Structure

Weight 10%
70

Answer B is well-structured as a single coherent prose paragraph. It moves logically from the problem statement to the proposed solution, then to specific measures, funding, trade-offs, community reactions, and the report's broader framing. The organization reflects genuine synthesis rather than sequential summarization.

Total Score

93

Overall Comments

Answer B is an outstanding summary that not only covers the essential facts of the proposal but also captures its underlying philosophy. By framing the issue as an "infrastructure and equity problem" and noting the plan's goal of "damage limitation," it provides a deeper, more insightful condensation of the source material. While slightly less detailed on the specific phases of the timeline, its superior compression and more analytical structure make it a more sophisticated and faithful summary of the source's core argument.

Faithfulness

Weight 40%
95

This answer is exceptionally faithful, capturing not only the factual components of the proposal but also its core conceptual framing, such as identifying the issue as an "infrastructure and equity problem" and the plan's philosophy as "damage limitation."

Coverage

Weight 20%
90

Coverage is excellent, touching on all the required elements. However, its description of the timeline ("implemented over six years with annual evaluations") is less detailed than Answer A's phased breakdown, representing a minor omission.

Compression

Weight 15%
95

Compression is outstanding. The summary is particularly effective at grouping related concepts, such as listing several distinct trade-offs and concerns in a single, concise sentence, which is a very efficient use of space.

Clarity

Weight 15%
90

The summary is very clearly written and accessible. It uses precise language from the source (like "layered response") that adds to clarity without becoming jargon.

Structure

Weight 10%
95

The structure is excellent and slightly more sophisticated than A's. It effectively "bookends" the summary by introducing the problem's framing at the beginning and concluding with the plan's overall philosophy, creating a highly cohesive narrative.

Comparison Summary

Final rank order is determined by judge-wise rank aggregation (average rank + Borda tie-break). Average score is shown for reference.
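
One way to read "average rank + Borda tie-break" is sketched below in Python, using the six Total Scores reported on this page. The page does not define the procedure, so several details are assumptions: each judge ranks answers by Total Score, tied answers share the better rank, and Borda points are the number of answers a given answer strictly outscores for that judge. The names `judge_ranks` and `aggregate` are hypothetical.

```python
from statistics import mean

# Total Scores per judge, copied from the page (one dict per judge).
TOTALS = [
    {"A": 84, "B": 91},
    {"A": 60, "B": 71},
    {"A": 91, "B": 93},
]


def judge_ranks(totals: dict[str, int]) -> dict[str, int]:
    """Rank 1 is best; tied answers share the better rank (assumption)."""
    return {
        name: 1 + sum(other > score for other in totals.values())
        for name, score in totals.items()
    }


def aggregate(per_judge: list[dict[str, int]]) -> list[tuple]:
    """Order answers by average rank, then by Borda points (descending)."""
    rows = []
    for name in per_judge[0]:
        ranks = [judge_ranks(totals)[name] for totals in per_judge]
        # Borda points: answers strictly outscored, summed over judges.
        borda = sum(
            sum(score < totals[name] for score in totals.values())
            for totals in per_judge
        )
        wins = sum(rank == 1 for rank in ranks)
        avg_score = mean(totals[name] for totals in per_judge)
        rows.append((name, mean(ranks), borda, wins, avg_score))
    rows.sort(key=lambda row: (row[1], -row[2]))
    return rows


for name, avg_rank, borda, wins, avg_score in aggregate(TOTALS):
    print(name, avg_rank, borda, f"{wins} / {len(TOTALS)}", round(avg_score))
```

With these inputs every judge places B first, so B has average rank 1 and all three winning votes, and the tie-break never comes into play. The mean scores round to 78 for A and 85 for B, matching the Average Score values shown for reference.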

Judges: 3

Winning Votes

0 / 3

Average Score

78

Winning Votes

3 / 3

Average Score

85

Judging Results

Why This Side Won

B is the winner because it provides a more insightful and faithful summary of the source text. While both answers are excellent and cover the required factual points, B goes a step further by capturing the conceptual framing of the proposal—identifying the problem as one of "infrastructure and equity" and the solution as "damage limitation." This demonstrates a deeper comprehension of the source material. This advantage in faithfulness, the most heavily weighted criterion, outweighs A's slightly more detailed coverage of the timeline.

Why This Side Won

Answer B wins primarily because it is more faithful to the source's overall argument (including the damage limitation framing), better structured as a coherent prose paragraph with logical flow, and adheres more closely to the word count constraint. While Answer A covers slightly more granular detail, it exceeds the word limit and reads more like a list of facts than a cohesive summary. On the most heavily weighted criterion (faithfulness at 40%), Answer B edges ahead by capturing the source's key framing. On structure (10%) and clarity (15%), Answer B is notably stronger. Coverage is roughly comparable, with Answer A having a slight edge in granular detail but Answer B capturing the broader argument better.

Judge Models OpenAI GPT-5.2

Why This Side Won

Answer B wins because it satisfies the required 180–240 word constraint while remaining highly faithful and covering the proposal’s key measures, funding, timeline, trade-offs, and community concerns. Answer A is similarly faithful and even more detailed, but it fails the word-limit requirement, reducing its overall benchmark quality under the weighted criteria.
