The Human Touch in Consulting and Business Analysis – Humans vs AI

At AnswerTeam, we are often asked, “Do you use ChatGPT to answer your questions?”

Our policy on the use of generative AI can be found here.

The answer is “yes”.  We use generative AI, we use Google, we use academic search engines, we use storyboarding techniques, we use strategic frameworks, we use human writers, we use human editors, and we make extensive use of creative analysis drawn from the individual experiences of the AnswerTeam team.

Those who see AI as replacing human consultants are most likely those who have had very little experience using AI.

Nonetheless, what value does human capital offer to clients who are increasingly adept at using generative AI themselves?  In the same way that popular search engines have not eliminated the need for traditional consulting services – and in fact may have helped to drive management and strategy consulting as a service industry – generative AI will serve to enhance the human role in analysis, problem solving, and decision-making on the greatest challenges that businesses face.

Before getting into the value-added component of human involvement in business analysis, this article looks at a more fundamental question: is there a detectable difference between AI-generated content and human-generated content?  The research summarized below provides an indication.

Studies on Human Ability to Distinguish AI-Generated vs Human-Written Text

Several surveys and studies have examined how well people can distinguish AI-generated text from human-written content. Below is a summary of some revealing studies.

General Content Evaluations by Non-Experts

Can the general public tell the difference between AI-generated and human-generated content?  The Crowdworker Study (Clark et al., 2021) at the University of Washington tested Amazon Mechanical Turk workers on distinguishing text from GPT-2 and GPT-3 models versus human text across three domains: stories, news articles, and recipes.  The study included 780 participants who evaluated 3,900 text samples (50 human and 50 AI texts per domain).

People performed about as well as random guessing for GPT-3 outputs (~49.9% correct, where ~50% is chance). GPT-2 outputs were only slightly easier to identify (57.9% correct).  This study indicated that, for the most part, people cannot reliably tell AI text from human text. Participants often cited grammar or style cues, but their reasons were inconsistent and sometimes contradictory.

Nonetheless, as people become increasingly familiar with AI text and its style, is there a growing awareness of what is AI-generated and what is not?  Generally, the answer is “no”.  A Penn State (PIKE Lab) experiment (Dongwon Lee et al., 2024) found a similar difficulty among readers in a controlled setting. Prof. Lee noted that people could correctly distinguish AI-generated text only about 53% of the time.  This result aligns with the study above: people’s ability to identify AI-written content is roughly equivalent to flipping a coin.
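As a back-of-the-envelope check on the “coin flip” claim, a simple binomial test shows why an accuracy of about 53% is statistically indistinguishable from guessing at realistic sample sizes. The sketch below is illustrative only: the trial count n is a hypothetical assumption, not a figure reported by the Penn State study.

```python
# Minimal sketch: is ~53% accuracy distinguishable from a coin flip?
# NOTE: n = 300 is a hypothetical trial count chosen for illustration;
# the actual number of judgments in the study is not given here.
from scipy.stats import binomtest

n = 300                    # hypothetical number of judgments
correct = round(0.53 * n)  # ~53% correct identifications

# Two-sided binomial test against chance performance (p = 0.5)
result = binomtest(correct, n, p=0.5, alternative="two-sided")
print(f"correct: {correct}/{n}, p-value: {result.pvalue:.3f}")
# The p-value comes out well above 0.05, so at this sample size
# 53% cannot be distinguished from random guessing.
```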

But what about text generated through interactive dialog?  Is AI chat similar to normal human conversation?  A conversational Turing test (Jones & Bergen, 2024) aimed to see whether humans can tell if they’re chatting with a bot or a person in real time.  Five hundred participants were split into roles for a 5-minute text-chat “Imitation Game.” Most were interrogators who conversed via a messaging interface with an unknown partner and then had to judge whether it was human or AI; a smaller group of participants acted as human “witnesses” trying to appear human.  The AI partners included a simple chatbot (ELIZA), GPT-3.5, or GPT-4 (all interlocutors were assigned randomly).

The results were similar.  The study showed that GPT-4’s conversations were often indistinguishable from a human’s: participants judged the GPT-4 model to be human 54% of the time.  Interestingly, the older ELIZA chatbot was correctly flagged as AI in most cases (only 22% of the time did people think ELIZA was human).  When it came to actual human partners, people were correct in 67% of chats; one-third of the time, a real human was wrongly thought to be a bot.

What about applications in academia?  A study on detecting AI-written student essays (Waltzer et al., 2024) examined whether educators and students can identify AI-generated academic writing. In an online survey, 140 college instructors (professors and lecturers) and 145 college students were each given an AI Identification Test.  The test presented pairs of short college exam essays on the same prompt – one written by an actual student under exam conditions, the other generated by ChatGPT – and asked the participant to decide which was AI and which was human.

Again, the task proved challenging. On average, instructors identified the ChatGPT-written essay only 70% of the time, while students were correct only 60% of the time.  The researchers took the challenge a step further, asking ChatGPT itself to identify human versus AI content; the chatbot achieved about 63% accuracy.  The study also noted that greater expertise did not guarantee success: instructors did perform better than students overall, but neither greater experience with ChatGPT nor subject-matter expertise (e.g. in the essay topic) significantly improved an individual’s performance.

So then, what about situations where very specific vocabulary is at play?  For example, can medical doctors tell the difference?  A peer-review study of scientific abstracts (Stadler et al., 2024) focused on a specialized domain: medical research abstracts in orthopedics. It asked whether expert reviewers in the field could spot AI-generated abstracts, and it also collected reviewers’ confidence (on a Likert scale) and the reasoning behind their decisions.

Even these domain experts had difficulty. Reviewers correctly identified only 62% of the AI-generated abstracts on average, while misclassifying 38% of genuine human-written abstracts as AI.  When reviewers did catch an AI-generated abstract, it was often due to obvious errors or impossibilities in the content (e.g. inconsistent data) or other tell-tale oddities; conversely, when real abstracts were flagged as AI, it was frequently because the writing style was mistaken for machine-like prose.

Factors Influencing Identification of AI vs Human Content

If readers in controlled studies are largely unable to tell human-generated content from AI-generated content, are there any meaningful clues that can help them discern the difference?

Across these and other similar studies, several factors can be noted.

Writing Style and Linguistic Cues: Participants commonly rely on the style, grammar, or “feel” of the text to make their judgments. For instance, crowdworkers in one study pointed to things like unnatural phrasing or repetitiveness, yet these cues were not consistently reliable.  Writing style does influence human perception, but AI models have become adept at mimicking human-like style, reducing the effectiveness of this cue.

Subject Matter and Content Complexity: The topic or genre of a text can affect detectability. A Tooltester survey showed that certain topics fooled people more often; health content, for example, produced a particularly high rate of confusion among readers.

User Expertise and Background: One might expect that experts (such as teachers or domain professionals) or heavy tech users would have an edge in spotting AI text. The evidence is mixed. In the college essay study, instructors performed somewhat better than students (70% vs 60%). Interestingly, familiarity with AI tools also showed only a minor effect: the Tooltester survey noted that users familiar with ChatGPT achieved just 48% accuracy vs 40% for those unfamiliar.

Demographics (Age Groups): Age appears to be a factor in at least one study. The interactive chat experiment found a negative correlation with age: perhaps unsurprisingly, younger participants were better at correctly identifying AI partners than older participants.

So is it possible to learn to spot AI-generated versus human-generated text?  As studies show, training humans with examples or guidelines yields only modest improvements. Clark et al. tried brief training interventions (such as showing annotators labeled examples or giving tips on known AI tells) and saw accuracy rise a few percentage points, but it still hovered near the results of untrained participants.

Then, if humans struggle to tell human-generated content from AI-generated content, why can AI do so much better?  Dongwon Lee’s lab (referenced above) developed an AI classifier that can identify AI vs human text with 85–95% accuracy, far exceeding human success rates.  This result suggests that while people write about as convincingly as machines, we are collectively being outwitted by our own machines in this one respect.  Why that is remains uncertain.  Perhaps for computers, “it takes one to know one”.
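To make that idea concrete, the sketch below shows one common way such detectors are built: a statistical model trained on labeled examples of AI and human text. This is not the PIKE Lab’s actual system, whose design is not described in this article; it is a minimal, illustrative pipeline, and both the tiny training corpus and the choice of features (word n-gram frequencies) are assumptions made purely for demonstration.

```python
# Minimal sketch of an AI-vs-human text classifier.
# NOTE: this is NOT the PIKE Lab's actual detector; it only illustrates
# the general idea that a model can learn statistical "tells" in word
# usage that individual human readers tend to miss.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical labeled corpus: 1 = AI-generated, 0 = human-written.
# A real detector would be trained on thousands of such examples.
texts = [
    "The results demonstrate a robust and significant improvement.",
    "Furthermore, it is important to note several key considerations.",
    "honestly we just winged it and the numbers somehow worked out",
    "my cat walked across the keyboard halfway through this draft",
]
labels = [1, 1, 0, 0]

# TF-IDF over word unigrams and bigrams feeds a logistic regression;
# the classifier weights the n-grams that best separate the classes.
model = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2)),
    LogisticRegression(),
)
model.fit(texts, labels)

print(model.predict(["We observe a consistent and notable trend."]))
```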

The Value of People

If humans themselves can scarcely distinguish AI-generated content from human-generated content, what value can humans – “people” – bring to the table?

While AI can process vast amounts of data, generate insights, and even draft reports, we believe that it lacks the nuanced understanding, contextual awareness, and empathy that define high-quality consulting. Clients seek assurance that their challenges are being considered with real human thought, not just addressed through a copy-paste response from a generative AI prompt.

Understanding Beyond Data

One of the greatest strengths of human consultants and business analysts is their ability to interpret data within the context of a company’s unique culture, industry, and specific challenges. AI can analyze trends and patterns, but it lacks the ability to discern the political dynamics within an organization, the emotions tied to certain business decisions, or the unspoken concerns of stakeholders. A well-crafted, customized solution requires a level of critical thinking that goes beyond algorithms – it requires human intuition and experience.

Building Trust Through Personalization

Clients need to feel heard, valued, and understood. Consulting is not just about providing answers; it’s about fostering relationships. When a consultant actively listens, asks clarifying questions, and tailors solutions to a company’s unique situation, they create trust and credibility. An AI-generated response may sound polished, but it often lacks the depth and contextual applicability that come from human engagement. A consultant who takes the time to craft thoughtful recommendations demonstrates their personal and professional investment in the client’s success.

Adaptability and Strategic Thinking

AI operates within the parameters of its training data and programmed logic. Human consultants, on the other hand, excel at navigating ambiguity, shifting priorities, and unforeseen challenges. While the process of building up to an analysis may have the imperfections inherent to real people, the completed analysis brings a sense of accomplishment to both the consulting team and the client.

To be sure, business environments are complex, with variables that can’t always be predicted. A “real-life” consultant brings the ability to pivot strategies on the go, draw from past experience, and think critically about evolving circumstances with an original imagination, in ways that AI simply cannot.

Ethical and Emotional Considerations

Business decisions are not always based solely on data; they often involve ethical dilemmas, competing priorities, and emotional factors. A company undertaking a restructuring exercise, for example, needs a consultant who can balance financial objectives with employee morale and company culture. AI might suggest the most cost-effective approach, but a human consultant can weigh the ethical implications and offer a compassionate perspective.

The Competitive Advantage of Human Insight

Ultimately, the value of consulting and business analysis lies in human expertise, judgment, and creativity. AI enhances efficiency and provides valuable support, especially in early-stage ideation and brainstorming of potential solutions.  But it is up to real people to prioritize the human touch in consulting and, in so doing, to stand out by offering clients something AI cannot – a truly customized, thoughtful, and relationship-driven approach.

As technology continues to evolve, the most successful consultants and analysts will be those who leverage AI as an assistant while ensuring that their insights remain deeply human. The key to long-term success in consulting is not just delivering information, but demonstrating understanding, foresight, and a genuine commitment to the client’s needs.

Final Thoughts

To be sure, with the content of this article in mind, readers have no doubt been asking themselves whether this article itself was written using generative AI.

AI requires humans.  The storyline, the framing of the issues, the generation of thought required to piece together different points of interest… all of these are part of the human communications experience and, in turn, require human input into their development.

“Who wrote what” is a more nuanced question.

At the same time, to the insightful observer, merely asking the question begs another:  does it matter?

That is a topic for another day.