Agreements Between Public Health Organizations and Food and Beverage Companies: Approaches to Improving Evaluation

Jean L. Wiecha; Mary K. Muth

doi:10.3768/rtipress.2021.op.0067.2101

Introduction

Substantial research implicates poor diet as a major factor contributing to noncommunicable diseases like diabetes, cardiovascular disease, stroke, and some types of cancer (Afshin et al., 2019; Mokdad et al., 2018). The food and beverage supply chain, including commodity suppliers, manufacturers, distributors, and retail, has a dominant role in improving food and beverage healthfulness in the marketplace through reformulating or introducing new products and changing retail, pricing, and promotion practices. One strategy for encouraging changes in the food and beverage supply chain is developing voluntary industry agreements between industry and public health partners (e.g., nongovernmental organizations [NGOs]). Recent examples in the United States include the Healthy Weight Commitment, in which 16 of the leading consumer packaged food and beverage manufacturers committed to selling 1.5 trillion fewer calories from 2007 through 2012 (Mozaffarian, 2014; Ng et al., 2014; Ng & Popkin, 2014), and the Balance Calories Initiative, in which the top three beverage companies pledged to reduce calories consumed from sugar-sweetened beverages by 20 percent from 2015 through 2025 (Cohen et al., 2018).

Voluntary industry agreements promote specific changes in product formulation, marketing, or distribution negotiated between industry and a public health partner. As such, they usually require participation in monitoring and evaluation. These agreements can be considered a type of industry self-regulation, but transparency, meaningful benchmarks, oversight, and objective evaluation are needed to ensure that they have a true public health impact (Sharma et al., 2010). Guiding principles developed for establishing agreements and partnerships that promote healthy food environments cite the need for accountability through monitoring and evaluation (Kraak, 2014; Kraak & Story, 2015). In addition, Kreslake & Sawyer (2017) developed a series of recommendations for evaluating voluntary industry agreements that called for more standardization and rigor in design, more focus on outcome measurement, and improvements in measurement quality.

Understanding the potential public health benefits of voluntary industry agreements has often been limited because most evaluations of these agreements focus on process evaluation involving activity verification. In a related paper, we assessed 20 peer-reviewed review articles and commentaries on voluntary industry agreements and 15 peer-reviewed evaluations of voluntary industry agreements (Wiecha & Muth, 2021). This assessment indicated that industry partners have a strong influence on agreement designs, which often weakens public health aims. Process evaluations, which are typically all that can be accomplished given evaluation budgets, have shown that some commitments have been implemented with fidelity to plans (e.g., reduction in sugary beverage calories shipped to schools, as noted in Wescott et al., 2012). However, others have shown limited changes, like stocking vending machines accessible to primary school students with healthy foods (Royo-Bordonada & Martínez-Huedo, 2014). Overall, most evaluations were conducted using designs that limited the understanding of attribution and causality of effects of the agreements on public health outcomes.

Interviews with 15 food and beverage companies, evaluation firms, and foundations that fund evaluations reinforced the fact that although industry agreements were usually driven by specific public health concerns, evaluation typically focused on implementation rather than public health outcomes (Wiecha & Muth, 2021). In addition, third-party evaluation lends credibility to industry agreements, but limited access to data, limited budgets, and industry partner influence reduce its effectiveness (Wiecha & Muth, 2021).

This paper builds on the findings in Wiecha & Muth (2021), Kreslake & Sawyer (2017), and Kraak (2014) to provide guidance on approaches and methods for improving the evaluation of voluntary industry agreements. Although we focus on evaluation of voluntary industry agreements that involve a public health organization working with a company to establish an agreement, most of the methods are also applicable to evaluations by researchers of initiatives that companies pursue independently. Our recommended steps provide information to public health researchers that could help strengthen agreement design and improve evaluation objectivity and rigor. This information is also useful to public health organizations that broker industry agreements and therefore must make decisions about evaluation goals as well as to the organizations that fund evaluations and therefore must make decisions about budgets to allocate for evaluation.

Developmental Evaluation

Evaluators should be engaged from the planning phase of each agreement onward. In developmental evaluation, an evaluator helps with project design, focusing on developing appropriate goals and objectives. Engaging evaluators is particularly useful in dynamic, adaptive, and complex settings like the food industry. Developmental evaluation can help ensure that the agreement will meet partners’ needs and have public health value and that the implementation plan is appropriately sized and achievable. Furthermore, it can identify potential unintended consequences and stimulate thinking on mitigation strategies. Developmental evaluation is typically an iterative process; it includes background research and structured, ongoing communication with the goal of building strategy based on real-time information. In this way, stakeholders (NGO partners, businesses, funders, and expert advisors) can identify and adapt efficient solutions in different contexts (Preskill & Beer, 2012). Additional stakeholders include government agencies and foundations that focus on public health goals.

Table 1 lists examples of questions that developmental evaluation activities could answer. Two outputs that can address these questions are industry profiles and logic models. Industry profile reports are based on publicly available data. They are most useful if completed before developing a contractual commitment but would still be useful if developed later because they can provide a 360-degree look at the industry in which a potential business partner operates. Industry profiles typically describe (1) the supply side of the industry (i.e., manufacturers, distributors, retailers, and foodservice), (2) the demand side (i.e., consumers as individuals or households, including demographic information and knowledge, attitudes, and behaviors), and (3) external factors affecting the industry such as regulations and general trends.

Table 1. Questions that can guide design and evaluation of industry initiatives

Phase	Guiding Questions
Developmental Evaluation	Setting: What industries and companies have strong potential for meaningful action and partnerships? Which are ready for engagement and change? Goals: What are reasonable end points for a collaboration? Is the aim formative (e.g., establishing trust among partners or implementing a pilot), outcome oriented (e.g., improving availability of healthful products), or impact oriented (e.g., improving dietary quality)? What science-based nutrition, health, health equity, and environmental goals are partners interested in addressing or achieving? What are the business goals? Outcomes: What activities or agreement elements are appropriate for the goals and are feasible and measurable?
Evaluation Planning	What is the perceived evaluability of the initiative based on whether data are available or could be collected to measure the intended outcomes? Is there interest in and funding for conducting a process evaluation, outcome evaluation, or both? Is there interest in and funding for developing qualitative, quantitative, or a combination of measures? What are existing data sources that can be used for the evaluation, and what new data collection is required? Will the design permit evaluators to attribute improvements in the intended outcomes to the initiative? What structures are in place to protect evaluator independence and objectivity? What are the potential unintended consequences of the initiative, and can they be assessed in the evaluation?
Process Evaluation	Did the initiative achieve its objectives, and was it implemented with fidelity to design? To what extent was it implemented as planned, and in what contexts or settings did the best implementation fidelity occur? What was the industry partner’s perspective on feasibility, acceptability, and fidelity during the implementation?
Outcome Evaluation	What effect did implementation have on availability and accessibility of healthier food and beverage options in the marketplace? What effect did implementation have on marketing and promotion strategies, events, and imprints? What effect did implementation have on consumer purchase and consumption behavior? Were there any adverse consequences, such as compensatory marketing of unhealthy products in other markets or environmental impacts?
Impact Evaluation	What effect did implementation have on intake of the targeted foods and beverages? What effect did implementation have on diet quality of the target population? What effect did implementation have on weight status of the target population, and are the changes sustainable? What effect did implementation have on reducing the incidence of noncommunicable diseases such as diabetes and cardiovascular disease in the target population?

A more in-depth profile could capture information that will help determine if a potential corporate partner is a good fit for a voluntary industry agreement, that is, whether the partner has the potential to truly ally with the public health partner’s goals. In addition to describing the structure of the industry, evaluators can use profiles to estimate the potential outcomes of an industry initiative in terms of the relative market shares of the affected products, businesses, or stages of the supply chain. Industry initiatives that affect greater market shares may have more potential public health impact, but they may also carry larger risks to the business partner because they will likely have larger impacts on business operations. The profile can increase transparency during the partnership-building process by describing corporate social responsibility activities, core product healthfulness, areas of corporate self-interest, and industry or company practices that undermine population or environmental health. For example, corporate social responsibility activities may demonstrate intent to improve health, whereas other corporate activities may appear to promote unhealthy foods or beverages. The toolkit provided in Kraak (2014) for assessing partnership opportunities and challenges can also be used to help guide the development of an industry profile. The Access to Nutrition Index-US, which scored a selection of multinational food companies on health-related dimensions of corporate policy, may be an additional relevant source when developing an industry profile (Access to Nutrition Foundation, 2018). The factors used to rate companies for the Access to Nutrition Index-US, such as information on corporate strategy, product formulation and nutrition labeling commitments, product and brand promotion practices, and product accessibility (Sacks et al., 2019), could provide relevant context.
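As a rough illustration of the market-share reasoning above, the following sketch estimates the share of category sales that an initiative would reach. The company names, sales figures, and the function name `affected_market_share` are hypothetical placeholders chosen for this example; they are not drawn from any actual industry profile.

```python
def affected_market_share(sales_by_company, participating):
    """Return the fraction of total category sales held by participating companies."""
    total = sum(sales_by_company.values())
    if total <= 0:
        raise ValueError("total category sales must be positive")
    covered = sum(
        value for name, value in sales_by_company.items() if name in participating
    )
    return covered / total


# Hypothetical annual category sales, in millions of dollars.
sales = {"Company A": 400.0, "Company B": 250.0, "Company C": 150.0, "Other": 200.0}

# Assume Companies A and B are the prospective agreement partners.
share = affected_market_share(sales, {"Company A", "Company B"})
print(f"Share of category sales covered: {share:.0%}")
```

In practice, the sales figures would come from purchased scanner data or public filings, and the same calculation could be repeated by product line or supply chain stage.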

The second output is a heuristic consisting of a logic model and a theory of change for a specific initiative. If possible, these should be informed by the industry profile. The logic model should clearly identify inputs, activities, outputs, and anticipated outcomes and impacts (if any) of the agreement (US Department of Health and Human Services, Centers for Disease Control and Prevention, Office of the Director, Office of Strategy and Innovation, 2011). A simple visual representation and a more complex version with additional descriptive language may both be necessary to ensure clarity and foster consensus. Although logic models help visualize what will happen under ideal circumstances—that is, a causal sequence—a theory of change addresses how. It describes the context and assumptions, establishes the causal pathway from interventions to outcomes, and identifies challenges or potential bottlenecks that could be addressed proactively (Taplin & Clark, 2012). The evaluator should work closely with the public health partners and participating companies to develop the logic model and theory of change that best represent the agreement at the outset, and the team should use these documents to devise their workplans. Going forward, they should revisit and revise them as circumstances dictate.
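One way to keep a logic model reviewable as it is revised is to record its components in a simple structure and check for gaps. The sketch below is only an illustration under our own assumptions: the class name `LogicModel`, the completeness check, and the example entries are hypothetical and do not represent a prescribed format.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class LogicModel:
    """Components of a logic model; impacts are treated as optional."""
    inputs: List[str] = field(default_factory=list)
    activities: List[str] = field(default_factory=list)
    outputs: List[str] = field(default_factory=list)
    outcomes: List[str] = field(default_factory=list)
    impacts: List[str] = field(default_factory=list)

    def missing_components(self) -> List[str]:
        """List the required components that have no entries yet."""
        required = ("inputs", "activities", "outputs", "outcomes")
        return [name for name in required if not getattr(self, name)]


# Hypothetical draft for a retail agreement, for illustration only.
draft = LogicModel(
    inputs=["evaluation budget", "retailer staff time"],
    activities=["post informational displays in stores"],
    outputs=["share of stores with displays posted"],
)
print(draft.missing_components())  # ['outcomes']
```

A check like this can flag a draft that names activities and outputs but has not yet stated the outcomes the partners expect.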

Process, Outcome, and Impact Evaluations

During the developmental phase, industry and public health partners reach consensus on the specific activities and goals of an agreement. Next, the evaluator can work with them to identify the resources available for evaluation, including budget and data, and proceed toward developing an evaluation design and plan. Partners should adopt the evaluation plan long before implementation begins to permit baseline data collection. Establishing data availability and evaluator access is of primary importance. In this section, we define process, outcome, and impact evaluations in the context of voluntary industry initiatives and subsequently discuss design, data collection, and data sources.

Process Evaluation

All voluntary industry initiatives should include a process evaluation to assess the extent to which objectives were met (e.g., all retail locations posted informational displays, changes in product formulation or pricing were made). Questions addressed in a process evaluation, as shown in Table 1, can assess how much implementation occurred and under what circumstances, as well as fidelity, feasibility, and acceptability on multiple dimensions, including cost, to help with decisions about scaling and sustainability. Process evaluations can focus on outputs listed in the logic model, framed as measurable, time limited, and, if possible, quantitative objectives. Supplemental qualitative information can shed light on implementation processes, successes, and challenges to enable quality improvement and scaling. Because outcomes cannot be attributed to an intervention that was not implemented, process evaluations are necessary for effective attribution.
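A quantitative process objective such as "all retail locations posted informational displays" can be checked with a simple fidelity rate. The sketch below assumes site-level audit results are available; the store identifiers, the results, and the function name `implementation_fidelity` are hypothetical.

```python
def implementation_fidelity(sites):
    """Return the fraction of audited sites where the planned activity was verified.

    `sites` maps a site identifier to True if the activity was verified there.
    """
    if not sites:
        raise ValueError("no sites audited")
    return sum(1 for verified in sites.values() if verified) / len(sites)


# Hypothetical audit results for four stores.
audits = {"store_01": True, "store_02": True, "store_03": False, "store_04": True}

rate = implementation_fidelity(audits)
target = 1.0  # The example objective calls for all locations to comply.
print(f"Fidelity: {rate:.0%} (target {target:.0%})")
```

Breaking the same rate out by setting or region would address the question of where implementation fidelity was highest.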

In many industry initiatives, process evaluation has been the final measurement step, often because of limited data access or insufficient evaluation budgets. In some circumstances, a terminal process evaluation may be sufficient; it may help a new industry partner feel comfortable with the initiative and with the evaluation process, it may confirm feasibility, or it may provide quality improvement data. An important role of a terminal process evaluation is to examine questions about implementation. Not all agreements are structured to achieve significant changes in the marketplace, and those that have more formative goals such as proof of concept, feasibility testing, and trust-building can still have value. Measuring these end points could clarify whether agreements have benefits related to increasing entry points into the healthy food movement for industry partners who are new to the table.

Outcome Evaluation

The focus on process evaluation and limited outcomes measurement has contributed to criticism that voluntary industry agreements lack evidence of effectiveness. Outcome evaluations assess what occurred as a result of implementation and illuminate whether activities led to healthful changes in the marketplace and in consumer attitudes and choices. Outcome evaluations may rely on a mix of quantitative and qualitative data (e.g., document reviews, environmental scans, and interviews) that respond to questions such as those listed in Table 1.

Outcome evaluations require baseline and follow-up data, including sales, marketing, and distribution data. Industry partners may hold some of this information and may choose to make it available to evaluators, but they may restrict the level of proprietary information they share. Other options include purchasing sales and marketing data from commercial data companies; conducting primary data collection, including surveys of consumers and businesses and environmental scans (e.g., broadcast media time sampling data); and obtaining government data. Each of these datasets has limitations, but with appropriate caveats, their use would expand knowledge about how voluntary industry initiatives affect the marketplace. We discuss their use below under Data Sources and Uses.

Impact Evaluation

Impact in health research typically refers to health-related effects of outcomes. In theory, if population reach for voluntary industry initiatives is sufficient, population-level surveillance data that measure diet quality, health status, and weight status could be used to identify improvements over time. Impact evaluations typically require rigorous designs that address change over time or change relative to a comparison group. However, attribution to specific industry initiatives will likely remain difficult, if not impossible, because of the numerous other factors affecting producer and consumer behavior over time. Although there is a low likelihood that a single voluntary industry initiative would merit impact evaluation, we list potential impact evaluation questions in Table 1.

Data Sources and Uses

Many evaluations of industry initiatives rely on data from the industry partner to assess implementation. Other data sources, such as primary data collection or purchased scanner and other sales data, are alternatives, but typical evaluation budgets will need to increase to cover their costs. Table 2 lists possible measurement methods that can be used for evaluating industry initiatives stratified by main sources of data (e.g., company- or organization-provided data, collected data, purchased data, government data) and indicates the potential applicability of each measurement method to process, outcome, and impact evaluation. The types of organizations that could enter into an industry initiative include manufacturers, grocery retailers, restaurants, and other entities that offer foods and beverages for purchase or consumption. Thus, the specific data that can be obtained from records, collected through primary means, and purchased will vary and affect what can be measured. Companies may be reluctant to share extensive data with external evaluators because of concerns that it could compromise their competitive position, in which case primary and purchased data may be preferred alternatives, assuming the evaluation budget is sufficient. Analysis methods will vary depending on the nature of the data, ranging from descriptive summaries of qualitative data, to tabulations and statistical testing of quantitative measures, to regression-based modeling approaches. As noted in Table 2, government surveillance data can be used for impact evaluation, but in most cases, the effects of industry initiatives are not expected to be large enough to produce detectable change in specific geographic areas or nationally.

Table 2. Possible measures and data sources for evaluation of industry initiatives

Source of Data	Measurement Methods	Process Evaluation (or Verification)	Outcome Evaluation^a	Impact Evaluation^b
Company-provided records (e.g., from manufacturers, stores, restaurants, vending machines)	Product formulation, packaging, and labeling records	✓	✓
	Menu offerings and nutrition information	✓	✓
	Product advertising, including paid and owned media (media type, volume, expenditures)	✓	✓
	Product shipment records (destination, quantity, attributes)	✓	✓
	Product purchase records (source, quantity, attributes)	✓	✓
	Product sales records (quantity, attributes, prices)	✓
	Store planograms	✓	✓
	Employee health and wellness program documentation	✓
	Website content (initiatives, education, calculators)	✓
Organization-provided records (e.g., from schools, day care, community-based organizations)	Product purchase records (source, quantity, attributes)	✓	✓
	Menu records	✓	✓
	School policy documentation	✓	✓
	Employee health and wellness program documentation	✓	✓
	Website content (initiatives, education, calculators)	✓	✓
	Federal program participation records	✓	✓
Primary data collection	Site visits (schools, NGOs)	✓
	In-depth stakeholder interviews	✓
	Consumer focus groups		✓
	Consumer surveys (telephone, mail, intercept, social media, opt-in web panel, nationally representative web panel)		✓
	Industry surveys (multiple modes)		✓
	Organization surveys (multiple modes)	✓	✓
	Store and restaurant audits (offerings, pricing, signage)	✓	✓
Commercial data (available for purchase)	Store scanner data (sales, prices)		✓
	Household scanner data (purchases, prices; with demographics)	✓	✓
	Label data (nutrition facts, ingredients, claims)	✓	✓
	Consumer food purchase diary data (foods consumed, food description, source of food; with demographics)		✓
	Menu data (offerings, prices, nutrition)	✓	✓
	Product advertising, including paid, owned, and earned media data (media type, volume)	✓	✓
Government data	Consumer Expenditure Surveys (purchase value)		✓
	NHANES dietary recall (consumption)		✓	✓
	NHANES anthropometric measures (height, weight)			✓
	NHANES survey (attitudes, behaviors, food spending)		✓	✓
	BRFSS (self-reported height, self-reported weight, food frequency)		✓	✓
	NHIS (self-reported height, self-reported weight, health status)			✓
	YRBSS (self-reported height, self-reported weight, food frequency)		✓	✓

BRFSS = Behavioral Risk Factor Surveillance System; NHANES = National Health and Nutrition Examination Survey; NHIS = National Health Interview Survey; YRBSS = Youth Risk Behavior Surveillance System.
^a Outcome measures relate to awareness, understanding, attitudes, availability, affordability, accessibility, reformulation, labeling, marketing, pricing, purchases, and consumption.
^b Impact measures relate to health status and weight status.

Execution Decisions

Some of the decisions that evaluators must consider in executing an evaluation are shown in Table 3. Most industry agreements are not structured to accommodate experimental designs, but quasi-experimental designs can yield evidence of cause and effect if appropriately planned with respect to selection of end points, sample size, data quality, data collection, and data analysis. Well-executed time-series analyses that include sufficient baseline data are generally the best option available to evaluate implementation and outcomes of voluntary industry agreements. Both time-series and pre-post designs with comparison groups will provide better evidence than pre-post designs without comparison groups or post-only designs. An evaluator may want to identify the strengths and weaknesses of different levels of approaches (e.g., offering bronze, silver, gold standard options) that fit the goals and funding for the evaluation and discuss how to optimize design selection with business and public health partners.

Table 3. Evaluation design and execution decisions

Type of Decision	Examples
Sampling strategy	Selection of localities, businesses, products, households, or other observational units for assessing impacts
Timing	Time period and frequency of data collection relative to the initiative’s timeline, including capturing baseline data before initiation
Comparisons	Pre-post, time-series, or control group comparisons
Triangulation	Whether results can be triangulated across multiple sources to strengthen validity and provide qualitative context in interpreting quantitative findings
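To illustrate why a pre-post design with a comparison group gives stronger evidence than a pre-post design alone, the sketch below computes a simple difference-in-differences estimate. This is a standard technique that we name here for illustration; the source does not prescribe it. All values are hypothetical, and the estimate is interpretable only under the assumption that both groups would have followed parallel trends without the initiative. No standard errors or significance tests are included.

```python
from statistics import mean


def difference_in_differences(treated_pre, treated_post, comparison_pre, comparison_post):
    """Change in the treated group minus change in the comparison group."""
    treated_change = mean(treated_post) - mean(treated_pre)
    comparison_change = mean(comparison_post) - mean(comparison_pre)
    return treated_change - comparison_change


# Hypothetical mean calories per item sold, one value per store.
treated_pre = [210.0, 200.0, 220.0]
treated_post = [190.0, 180.0, 200.0]
comparison_pre = [205.0, 215.0, 210.0]
comparison_post = [200.0, 210.0, 205.0]

did_estimate = difference_in_differences(
    treated_pre, treated_post, comparison_pre, comparison_post
)
print(f"Difference-in-differences estimate: {did_estimate:.1f} calories per item")
```

In this made-up example, participating stores dropped by 20 calories per item while comparison stores dropped by 5, so a pre-post design without a comparison group would overstate the change attributable to the initiative.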

Conclusion

Evaluators can help ensure that voluntary industry initiatives for improving food and beverage healthfulness are operating as intended, having the desired effect on food industry practices, and leading to meaningful outcomes. To do this, evaluators must provide objective, data-driven input from the outset of the planning process through the end of the project. Developmental input can include helping public health partners identify suitable industry partners and set measurable, realistic, and meaningful objectives for implementation and outcomes. Doing so helps ensure that public health partners support agreements that can have a true public health impact and avoid those that merely provide companies with talking points about their engagement in public health initiatives. Subsequently, evaluators can design appropriate process and outcome evaluations that measure progress and assess barriers and facilitators to change; measure implementation; and assess actual outcomes and impact on sales and consumer knowledge, attitudes, and behavior. In addition to ensuring that voluntary industry agreements are operating as intended, more comprehensive and rigorous evaluation could inform the development of future agreements to improve public health impact. However, improved access to data and larger evaluation budgets will be needed to facilitate these efforts.

Acknowledgments

Support for this research was provided by the Robert Wood Johnson Foundation (Grant No. 74483). The views expressed here do not necessarily reflect the views of the Foundation. We also acknowledge the advice and support of Priya Gandhi, Victoria Brown, and Tina Kauh at the Foundation.