Structured Prompts, Better Outcomes? A Deep Dive into Structured ChatGPT Interfaces in Graduate Robotics Education (2025)

1. Introduction

As Large Language Models (LLMs) like ChatGPT continue to gain prominence in education, questions are emerging not just about whether these tools can help learning, but about how they should be used to best support it. LLMs offer students an unprecedented opportunity to access real-time assistance, explore complex problems, and iterate on ideas quickly. Yet the very openness and flexibility that make LLMs appealing can also become barriers to effective use, especially when students lack the skills to engage productively.


Prior research has shown that LLM effectiveness depends heavily on how students prompt them. Just as the quality of a search engine query affects the usefulness of the results, so too does the quality of an LLM prompt determine the relevance and educational value of its responses. This raises a crucial question for educators: Can structured prompting interfaces help students learn to interact more effectively with LLMs, and thereby learn more effectively overall?

This article explores that question in depth, drawing from a recent study conducted with 58 students in a graduate robotics course. By evaluating a structured interface designed to encourage effective LLM use, the study provides insight into how interface design, prompting behavior, and learner perceptions interact in the context of advanced technical education.

2. Research Context and Motivation

In many academic settings, students are beginning to use LLMs for a wide range of tasks: drafting essays, debugging code, synthesizing research, and more. However, educational researchers warn that without guidance, LLM use can become superficial or even misleading. Some students treat LLMs as "answer machines" rather than tools for thinking, leading to surface-level engagement that doesn't promote deep learning.

The structured interface examined in this study was designed to counter this trend by scaffolding students’ prompting behavior. The idea was to guide them toward better prompt formulation (a brief sketch of such a scaffold follows the list), including the use of:

  • Clear problem definitions

  • Requests for step-by-step reasoning

  • Explanations of code behavior

  • Reflections on the LLM’s outputs
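
The paper does not publish the interface itself, so the following is only a minimal sketch of how such a scaffold might assemble these fields into a single prompt; every name and field here is an illustrative assumption rather than the study’s actual design.

```python
# Illustrative sketch only: the study's structured interface is not published.
# This shows one way the scaffold fields listed above could be assembled
# into a single ChatGPT prompt.
from dataclasses import dataclass


@dataclass
class StructuredPrompt:
    problem_definition: str           # what the student is trying to solve
    request_reasoning: bool = True    # ask for step-by-step reasoning
    code_to_explain: str = ""         # optional code the student wants explained
    reflection: str = ""              # student's own reading of the last output

    def render(self) -> str:
        parts = [f"Problem: {self.problem_definition}"]
        if self.request_reasoning:
            parts.append("Please reason step by step before giving an answer.")
        if self.code_to_explain:
            parts.append("Explain what this code does and why:\n" + self.code_to_explain)
        if self.reflection:
            parts.append("My current understanding of your last answer: " + self.reflection)
        return "\n\n".join(parts)


# Example use: the rendered text would be sent to ChatGPT as a single message.
prompt = StructuredPrompt(
    problem_definition="My PID controller oscillates around the target joint angle.",
    code_to_explain="error = target - angle\noutput = kp * error",
)
print(prompt.render())
```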

The study's key research questions were:

  1. Does structured prompting influence student behavior when using LLMs?

  2. Does it affect their learning or performance?

  3. Do students continue these prompting strategies when the structured interface is removed?

  4. How do students perceive the usefulness and relevance of the structure?

3. Methodology

Participants and Study Design

The study involved 58 students enrolled in a graduate-level robotics course at a major university. Students were randomly assigned to one of two groups:

  • Intervention Group: Used a structured GPT-based interface for two practice lab sessions.

  • Control Group: Used ChatGPT with no restrictions.

In the third lab session, both groups were allowed to use ChatGPT freely.

Data Sources

The researchers collected and analyzed:

  • Prompt logs: To track behavioral differences in how students used ChatGPT (a sketch of this kind of log analysis follows the list)

  • Pre- and post-tests: To assess conceptual learning gains

  • Task scores: To evaluate lab performance

  • Surveys: To understand student attitudes and perceptions
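
The authors’ analysis pipeline is not included in the article, but as a rough illustration, prompt logs of this kind could be reduced to the behavioral metrics reported later (prompt length, explanation requests, number of turns) with something like the sketch below; the log format and cue words are assumptions made for the example.

```python
# Illustrative sketch only: the log format and cue words are assumed, not taken
# from the study. It reduces raw prompt logs to simple behavioral metrics.
import re
from statistics import mean

# Hypothetical log: one entry per student prompt.
logs = [
    {"student": "s01", "prompt": "Why does my A* heuristic overestimate? Explain step by step."},
    {"student": "s01", "prompt": "Fix this code."},
    {"student": "s02", "prompt": "Give me the answer to lab task 2."},
]

EXPLANATION_CUES = re.compile(r"\b(why|explain|justify|step by step)\b", re.IGNORECASE)


def metrics_for(student: str) -> dict:
    prompts = [entry["prompt"] for entry in logs if entry["student"] == student]
    return {
        "turns": len(prompts),
        "mean_prompt_length": mean(len(p.split()) for p in prompts),
        "explanation_requests": sum(bool(EXPLANATION_CUES.search(p)) for p in prompts),
    }


print(metrics_for("s01"))  # {'turns': 2, 'mean_prompt_length': 6.5, 'explanation_requests': 1}
```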

4. Key Findings

A. No Difference in Performance or Learning Outcomes

Perhaps the most surprising result was that structured prompting did not lead to statistically significant differences in learning gains or lab task scores between the groups. Both groups performed similarly on coding tasks and on pre-post learning assessments.

This suggests that simply altering the prompting interface may not be sufficient to produce measurable academic benefits—at least not over the short term or across a limited number of sessions.

B. Structured Prompting Encouraged More Effective Behaviors

Despite the lack of performance differences, the log data revealed that students in the structured group exhibited more “productive” prompting behaviors:

  • They wrote longer, clearer prompts

  • They asked more frequently for explanations or justifications

  • They engaged in more iterative interactions (e.g., refining prompts, asking follow-up questions)

Crucially, these behaviors correlated with higher learning gains, even though they were adopted less consistently by the control group.
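
As a rough illustration of how such a behavior-to-gain relationship might be tested, a correlation between a prompting metric and per-student learning gains could be computed as in the sketch below; the numbers are invented and the statistical choices (Pearson correlation, independent-samples t-test) are assumptions for this sketch, not the authors’ reported methods.

```python
# Illustrative sketch only: all values are invented and the tests chosen here
# are assumptions, not the study's published analysis.
from scipy import stats

# Hypothetical per-student data: mean prompt length (words) and normalized
# learning gain from the pre/post tests.
mean_prompt_length = [12.0, 25.5, 8.0, 31.2, 18.4, 22.1]
learning_gain = [0.10, 0.40, 0.05, 0.55, 0.30, 0.35]

# Does more elaborate prompting go with larger gains?
r, p = stats.pearsonr(mean_prompt_length, learning_gain)
print(f"Pearson r = {r:.2f}, p = {p:.3f}")

# The group-level null result could be probed with a two-sample comparison.
intervention_gain = [0.40, 0.55, 0.30]
control_gain = [0.10, 0.05, 0.35]
t, p = stats.ttest_ind(intervention_gain, control_gain)
print(f"t = {t:.2f}, p = {p:.3f}")
```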

C. Behavior Did Not Transfer After Removing the Structure

Once the structured interface was removed, students reverted to shorter, less thoughtful prompts—even those who had previously used the structured system. The positive behaviors didn’t seem to "stick" when students were left to their own devices.

This points to a limitation of bottom-up design interventions: while structured interfaces can shape behavior during use, they may not build lasting habits unless paired with other forms of motivation or instruction.

D. Mixed Student Perceptions

Survey responses revealed a mixed reception to the structured platform:

  • A subset of students appreciated the structured approach, noting that it helped them think more clearly and use ChatGPT more effectively.

  • However, most students found the structure constraining or irrelevant, especially those who had already formed habits around how they typically used LLMs.

Many students saw ChatGPT primarily as a productivity or debugging tool—not as a learning partner—which shaped their resistance to new prompting styles.

5. Discussion

Structured Interfaces Can Guide, But May Not Transform

This study illustrates a broader challenge in educational technology: providing scaffolds that encourage best practices without alienating users. The structured interface successfully nudged students toward better behaviors while it was in place, but these behaviors did not persist afterward.

This suggests that temporary or bottom-up changes to interfaces—while useful for exploratory studies—may be insufficient to drive long-term change in educational habits.

Prompting Quality Still Matters

One of the most important contributions of the study was confirming that how students prompt matters more than whether they use ChatGPT at all. Students who wrote thoughtful prompts and asked for understanding-oriented outputs gained more from their interactions.

This is consistent with broader theories of learning through explanation, metacognition, and elaborative interrogation.

Design Implications: Combine Structure with Motivation

If we want students to adopt better LLM usage habits, we may need to combine structural nudges with top-down strategies, such as:

  • Teaching students why certain prompting behaviors are effective

  • Demonstrating examples of productive and unproductive prompting

  • Incentivizing good prompting in grading rubrics or classroom feedback

Just as students learn how to write good research questions or debug code methodically, they may also need to be explicitly taught how to “collaborate” effectively with an LLM.

6. Recommendations for Educators

Educators hoping to incorporate LLMs into STEM or computer science curricula should consider the following:

  1. Integrate Prompting Instruction: Treat prompting as a skill worthy of instruction and practice, just like debugging or design thinking.

  2. Use Structured Prompts as Scaffolds, Not Crutches: Structured interfaces can be useful for beginners but should transition into open-ended practice with feedback.

  3. Model Productive Prompting Behavior: Instructors should model how to use LLMs to think rather than just solve.

  4. Assess LLM Usage Quality: Grading criteria or reflection assignments can include how well students engage with LLMs, not just final outputs.

  5. Foster Metacognition About LLMs: Encourage students to reflect on how they used ChatGPT, what worked, and what didn’t.

7. Limitations and Future Directions

The study had several limitations:

  • Short duration: A longer-term study might capture habit formation better.

  • Small sample size: a cohort of 58 students limits generalizability.

  • Context specificity: Results may not transfer to non-technical fields or undergraduate students.

Future research could:

  • Explore the impact of explicit instruction in prompting, not just structured interfaces.

  • Investigate how different motivational frames (e.g., emphasizing mastery vs. productivity) shape LLM engagement.

  • Examine peer-based prompting activities, where students evaluate each other’s interactions with LLMs.

8. Conclusion

This study adds nuance to the conversation around LLMs in education. While structured prompting interfaces can shape behavior temporarily, they may not create lasting change unless students are also motivated and explicitly taught why these strategies matter.

In short, structured prompts may lead to better outcomes—but only when embedded within a broader educational strategy that values thoughtful engagement, reflective learning, and a clear understanding of how to collaborate with intelligent systems.