[Image: a diverse group of computer science students working together on laptops in a modern university library, some screens showing code or text documents.]

Cracking the Code: How LLMs Are Reshaping Computer Science Student Writing

Hey there! Let’s talk about something that’s been buzzing around universities lately, especially in the world of computer science: Large Language Models, or LLMs. You know, ChatGPT and the like. If you’re a student, or you teach students, or frankly, if you just exist in the modern world, you’ve probably noticed they’re everywhere. And guess what? They’re totally shaking things up, particularly when it comes to writing.

The Writing Hurdle in Computer Science

So, the latest buzz from the big shots in computer science education (think ACM and IEEE-CS) is that students need to write more. Specifically, they’re pushing for things like white papers in a bunch of courses. Now, this sounds great on paper, right? Get those future tech wizards thinking critically and articulating their ideas. But here’s the kicker: writing isn’t always our strong suit. Researching? Articulating thoughts clearly? Following specific formats? Checking if what you’ve written actually makes sense and is accurate? Yeah, these can be real challenges for students who are maybe more comfortable with code than prose.

This is where LLMs waltz in. They *could* be awesome allies, right? Like having a super-powered brainstorming buddy or a structure guru. They can kickstart research ideas, offer inspiration, and help frame your points. But, and it’s a big but, they can also make things way too easy, leading to shallow work and, let’s be honest, spreading stuff that isn’t checked or even true. It’s a bit of a tightrope walk.

Our Little Experiment

To figure out how to navigate this, especially for students just starting out with research writing, we decided to run an experiment. We wanted to help students build those crucial writing skills without penalizing the folks who were already pretty good at it the old-fashioned way. So, we cooked up a set of clear rules and strategies for writing these “briefing reports” (which are basically white papers), allowing students to use LLMs in different ways: fully, partially (what we called ‘Hybrid’), or not at all (‘No LLM’).

We embedded this whole system into our faculty’s learning management system (we use Moodle, if you’re curious). We rolled it out and evaluated it not once, but twice, over the 2023/24 academic year with over 150 students. This whole article is basically me sharing what we did, what we learned from the first go-round, and how we tweaked things for the second.

And the big takeaway? Based on these two carefully run studies, both teachers and students ended up feeling pretty positive about how LLMs can impact the quality of student reports, *when used correctly*. It feels like we might have stumbled onto a way to integrate AI-generated content that could actually be a solid model for others.

Why All the Writing, Anyway?

You might wonder, why are computer science folks suddenly so focused on “soft skills” like writing? Well, employers have been asking for it! They need their tech teams to communicate effectively. These skills started popping up in curricula years ago, often linked to professional practice or capstone projects. The latest curriculum guidelines explicitly call for writing white papers in several core areas, from algorithms to ethics.

White papers are more than just essays; they’re persuasive reports presenting info, analysis, and recommendations on a topic. They’ve been around forever, shaping policies in all sorts of fields. Now, CS students are expected to churn out a bunch of these during their degree. And yeah, for students new to research and formal writing, it’s a steep hill to climb. White papers need research, they need to be objective, follow a strict structure (intro, findings, analysis, recommendations, conclusion), and importantly, show the author’s critical thinking.

In our Computer Ethics course, which has been running for ages and aligns with industry guidelines, white papers felt like a perfect fit. Especially since many of our final-year students already have industry experience where they’ll likely need to write similar reports. We hoped this would really help them bridge the gap between academic knowledge and professional skills.

Even though we’ve taught this course for years, white papers were new for us too! So, we totally expected to make some mistakes, and we did. But talking to companies and getting student feedback helped us refine things for the next year.

How We Structured the Project

Our project was a team effort, with about 12 students per team: one team leader and 11 members, everyone carrying roughly the same workload. The project followed a sequence of steps designed to guide teams through the process (there’s a rough code sketch of the workflow right after the list):

  • We presented the project topics and explained the rules for individual and group reports.
  • Students chose their teams (publicly visible, so they could group with friends).
  • Within the team, they picked a leader.
  • The leader defined specific sub-topics for members and sent them to us (the profs) for review. We’d send them back for tweaks if needed.
  • Once approved, the leader posted the topics, and members chose one. (Leaders didn’t write individual papers; their job was coordination.)
  • Students wrote their individual reports and submitted them via a special quiz in Moodle that matched their chosen writing mode (No LLM, Hybrid, Only LLM).
  • Simultaneously, they posted their report and a couple of slides on the team’s private forum.
  • The team leader compiled the individual reports into a single group white paper and presentation.
  • Team members reviewed the group report and presentation.
  • Selected volunteers presented the group work publicly (earning bonus points!).
  • After presentations, we’d have a public discussion with questions, involving everyone. These discussions were part of the evaluation.
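
For the structurally minded, here’s a minimal sketch of that workflow in Python. It’s purely illustrative: the role names and step labels are our paraphrase of the process above, not anything Moodle actually exposes.

```python
from enum import Enum

class Role(Enum):
    INSTRUCTOR = "instructor"
    LEADER = "team leader"
    MEMBER = "team member"

# Paraphrased workflow steps and who drives each one.
PROJECT_WORKFLOW = [
    (Role.INSTRUCTOR, "Present topics; explain rules for individual and group reports"),
    (Role.MEMBER, "Join a team (membership is publicly visible)"),
    (Role.MEMBER, "Elect a team leader"),
    (Role.LEADER, "Propose sub-topics and submit them for instructor review"),
    (Role.MEMBER, "Pick an approved sub-topic (leaders coordinate rather than write)"),
    (Role.MEMBER, "Write the individual report; submit it via the mode-specific quiz"),
    (Role.MEMBER, "Post the report and slides on the team's private forum"),
    (Role.LEADER, "Compile individual reports into the group white paper and slides"),
    (Role.MEMBER, "Review the group report and presentation"),
    (Role.MEMBER, "Volunteers present publicly; everyone joins the graded discussion"),
]

for step_number, (role, description) in enumerate(PROJECT_WORKFLOW, start=1):
    print(f"{step_number:2}. [{role.value}] {description}")
```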

Interestingly, when we compared the topics proposed by team leaders to what LLMs suggested, about a third were pretty similar to AI-generated ideas. Students had about 20 days for their individual reports.

The Quizzes: Our Secret Weapon

The core of tracking how students worked was these three quizzes in Moodle. Students uploaded their reports and, crucially, showed their process based on their chosen mode:

  • No LLM Mode (The Traditionalists): Students listed three key research questions they explored, the sources they used for each, and copied key snippets from those sources that informed their writing.
  • Only LLM Mode (The AI Aficionados): Students stated which LLM they used. Then, they listed the three key questions (prompts) they asked the LLM, copied the AI-generated answers, and *then* had to find sources to *verify* the facts in the AI’s answers.
  • Hybrid Mode (The Collaborators): This was the recommended approach – a mix of human and AI. Students listed a combination (we suggested 50/50) of traditional research questions and LLM prompts. Like the other modes, they had to list sources for research questions and verify AI answers with sources.

At the end of all quizzes, students uploaded their report sections: introduction (explaining their approach), the main body (elaboration), and the conclusion (with their critical stance and recommendations). They also had to list all references properly.
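
If it helps to see the shape of it, here’s a minimal sketch (Python, purely illustrative; the field and class names are ours, not Moodle’s) of what a quiz submission had to contain depending on the chosen mode:

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional

class Mode(Enum):
    NO_LLM = "No LLM"
    HYBRID = "Hybrid"
    ONLY_LLM = "Only LLM"

@dataclass
class ResearchQuestion:
    question: str                # a traditional research question
    sources: list[str]           # sources consulted for that question
    key_snippets: list[str]      # snippets copied from those sources

@dataclass
class LlmPrompt:
    prompt: str                  # question (prompt) asked to the LLM
    llm_answer: str              # AI-generated answer, copied in
    verifying_sources: list[str] # sources found afterwards to verify the answer

@dataclass
class QuizSubmission:
    mode: Mode
    llm_used: Optional[str]      # e.g. "ChatGPT"; None in No LLM mode
    research_questions: list[ResearchQuestion] = field(default_factory=list)  # No LLM and Hybrid
    llm_prompts: list[LlmPrompt] = field(default_factory=list)                # Only LLM and Hybrid
    introduction: str = ""       # explains the chosen approach
    main_body: str = ""          # the elaboration
    conclusion: str = ""         # critical stance and recommendations
    references: list[str] = field(default_factory=list)
```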

We noticed students often used chatty language with LLMs (“What can you tell me about…”) but more formal queries with search engines (“Impact of X on Y”). Some students totally missed the point, asking LLMs to write the whole report or using the same query for both LLMs and search engines, not grasping the different ways they work. Asking LLMs to write the whole thing? Instant zero points. That wasn’t the goal.

[Image: close-up of a learning management system screen with charts and data on student performance and LLM usage.]

Grading and Keeping It Honest

Each part of the quiz (research questions/prompts, sources, report sections) was graded. Quality of questions/prompts, relevance of sources/verification, and report quality all factored in. Verbatim copying (plagiarism, including cross-language!) meant an immediate zero. Improper citation cost points. If no plagiarism was found, we applied a multiplier based on:

  • Individual effort (based on document metadata and time spent on the quiz).
  • Originality (manual check comparing their work to sources/AI output).
  • Following report rules (word count, references).

Bonus points came from presenting and participating in public discussions. These face-to-face chats were awesome because they showed us that, regardless of *how* they wrote the report, students who participated seemed to have gained a similar level of understanding of the topic. It’s hard to fake knowing something when you’re discussing it live!
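
To make the mechanics concrete, here’s a rough sketch of the scoring logic as we’ve described it. The 0-to-1 scales, the equal weighting of the three factors, and the exact function shape are our illustrative assumptions, not the course’s actual rubric.

```python
def report_score(base_points: float,
                 plagiarism_found: bool,
                 effort: float,          # 0..1, from document metadata and time spent on the quiz
                 originality: float,     # 0..1, manual comparison against sources / AI output
                 rules_followed: float,  # 0..1, word count, references, formatting
                 bonus_points: float = 0.0) -> float:
    """Illustrative only: the weights and scales are assumptions, not the real rubric."""
    if plagiarism_found:
        return 0.0                       # verbatim copying (even cross-language) meant an instant zero
    multiplier = (effort + originality + rules_followed) / 3
    return base_points * multiplier + bonus_points

# Example: solid effort, decent originality, rules fully followed, plus discussion bonus.
print(report_score(base_points=20, plagiarism_found=False,
                   effort=0.9, originality=0.8, rules_followed=1.0,
                   bonus_points=2))      # -> 20 * 0.9 + 2 = 20.0
```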

By the Numbers: Who Used What?

In the first project, 181 students were involved (15 leaders, 166 members). Of those, 150 completed the quizzes, and 14 submitted reports only via the forum (they still got evaluated, but were reminded to use the quiz next time). Here’s the breakdown of how students chose to work in Project 1:

  • No LLM: 78 students (slightly over half)
  • Hybrid: 68 students
  • Only LLM: 6 students (very few!)

For the second project, 135 students completed the quizzes (6 only used the forum). The numbers shifted a bit:

  • No LLM: 74 students
  • Hybrid: 47 students
  • Only LLM: 13 students (more than double!)

Most students stuck with their initial choice, but some moved from No LLM to Hybrid, or vice versa. Interestingly, a few more brave souls jumped into the “Only LLM” pool for the second project.

When it came to *which* LLM, ChatGPT (the free version) was the clear winner, especially for the Hybrid approach. Microsoft’s Bing Chat was a distant second. Google’s Gemini wasn’t very popular yet. Most students, even non-native speakers, preferred using English with the LLMs, probably because the answers are generally better quality and, well, English is the lingua franca in CS.

[Image: computer science students in a lively discussion during a group presentation.]

Connecting the Dots: Results and Modes

We were super curious if the “better” students (based on midterm exam results) preferred a certain mode. The midterm results were pretty evenly distributed, but students who chose the No LLM approach had a slight edge on average. Their reports in the first project were also generally better than the Hybrid ones.

Our hypothesis? Maybe the students who chose traditional writing were just more diligent, more invested, and better at following instructions. They also seemed more eloquent and critical thinkers in discussions, suggesting they already had solid writing experience and didn’t feel the *need* for LLM help. Students who went for Hybrid seemed a bit more hesitant initially, maybe due to less research experience or difficulty adapting to the structured process.

However, things got interesting in the second project! While the midterm results correlation stayed similar, the project results improved significantly, especially for the Hybrid group. It seems the feedback from the first project really helped students understand the expectations and refine their approach. The scatter plots showed that most students improved their project scores in the second attempt. There was a correlation between project scores and exam scores, suggesting that diligence and understanding translate across different assessment types.

Student and Teacher Vibes

Our overall impression as teachers? Students *loved* using LLMs. They used them widely, especially during the initial research for their reports. The quizzes, where they had to show their work and verify AI answers, felt like a good way to prevent outright misuse while letting them explore the tools. We were a bit surprised that so many chose traditional writing initially, but the survey revealed many of them *still* used LLMs in the background for inspiration or quick info gathering, just not as their primary method shown in the quiz.

Students who used Hybrid appreciated combining traditional research with AI suggestions. Interestingly, some Hybrid users in the first project switched to traditional in the second, but even more moved towards Only LLM. This shift, especially among students with better midterm results, was unexpected but suggests they gained confidence in using AI effectively.

The second project’s strong results and relatively low cheating (which we checked thoroughly) encourage us to keep refining this approach.

We surveyed 112 students about their experience. Only 6 said they didn’t use LLMs at all, while 19 used them daily for reports! Their reasons were clear:

  • “It helps me get the info I need without me wasting time searching through Google.” (Speed!)
  • “I believe it’s great to produce a clear text and to reformulate an idea one can have. Mostly when English (or other languages) is not our native language.” (Drafting and language support!)
  • “I mainly use it by inputting finished blocks of text that I’m not entirely happy with, asking it to rewrite them ‘more formally’ or just differently. I never find the whole output useful by itself, but there are always some of the generated rephrasing that I like better than my own text.” (Refinement, not replacement!)

These comments totally align with our view that LLMs are great starting points. While a few students were against LLMs in education (worrying about creativity loss), a clear majority (68.75%) supported their integration. And a solid three-quarters (over 75%) were firmly against using LLMs in exams, which we agree with!

Filling out the quizzes was a bit tricky for some initially (18.75% read them multiple times), but after feedback, many (25%) realized their mistakes and corrected them in the second project. A good chunk (28.57%) actually found the quizzes intuitive. The main complaint after Project 1 was that the instructions were too long. We shortened and clarified them for Project 2, and it clearly helped improve student performance.

We also noticed students consulted LLMs not just on their topic, but also on report structure. This helped them organize their thoughts and complete the quiz correctly. We plan to encourage this structural consultation even more next year.

[Image: stylized human hands and glowing AI elements working together on a digital document.]

Wrapping Up and Looking Ahead

LLMs are definitely here to stay in academia. They offer instant, focused answers, brainstorm ideas, summarize, and paraphrase – making them powerful tools for writing. They can even encourage critical thinking by showing multiple viewpoints. Students found they helped with reports and even suggested content, boosting their skills.

But yeah, the challenges are real: paper mills, AI “hallucinations” (making stuff up), LLMs not understanding context (like the “whistleblowing” example!), and especially plagiarism. Students copying AI output verbatim without citation was a big one, partly because they didn’t realize AI content counted as someone else’s work needing citation. Our quizzes, by requiring them to show sources and AI inputs, made detecting this *much* easier.

We really feel our structured approach helped maximize the benefits of LLMs while minimizing the risks to academic integrity. Requiring research questions/prompts and source verification kept students focused and ensured accuracy. The quiz structure itself guided them through the white paper format (intro, body, conclusion, references). This helped them build arguments based on evidence, not just guesswork.

We’re repeating this approach next year, with tweaks. We’ll make instructions even clearer (maybe demos instead of just text) and simplify the process. To boost participation in discussions (which were super valuable for confirming understanding), we plan to increase their weight in the final grade. If a student can competently discuss a topic, even if AI helped write the report, they’ve still learned!

Integrating LLMs thoughtfully is a huge step for modern education. Our quiz-based approach feels adaptable for various courses and sizes, easy to implement in different LMSs, and flexible. It’s not about banning AI; it’s about teaching students how to use it responsibly and effectively to become better researchers and writers. Universities, as pillars of progress, need to lead the way in figuring this out.

Source: Springer
