Widespread availability of and interest in AI models (e.g., ChatGPT) has, not surprisingly, motivated researchers - and people who need to demonstrate research activity - to look for new ways to report findings based on minimal direct engagement with data. During the Golden Age of QDAS (Qualitative Data Analysis Software) emergence - by which I mean some point in the 2010s when NVivo, Atlas.ti, MAXQDA, and, to a lesser extent, the pioneer in web-based QDAS, Dedoose, were becoming known and more widely used - it seemed that a lot of people around me were particularly excited about the potential for a qualitative analog of statistical software programs like SPSS or SAS. By this, I mean a mechanism for entering, or directing the program toward, your data, clicking some boxes, and obtaining comprehensive output that showed the final analysis. Of course the QDAS programs did not quite work like that; they were instead database builders where text was selected rather than typed, and categories could be created as you went rather than needing to be defined beforehand. This to me was still great progress, especially in these programs' ability to run queries and hyperlink to the context of each coded excerpt. I have coded in Word a lot, and still use Word, but it requires workarounds to see context and is not ideal for people who do not have great typing skills. And although I did not mention Quirkos above - because it came along a little later than the others - it is my go-to QDAS program these days. These programs have all become more sophisticated over time. However, even the so-called auto-coding (or theory-building, available in HyperRESEARCH) functions are mostly researcher initiated, require some prompts, and cannot interpret context, although the program may identify patterns, typically based on frequency. In a way, you could think of most analysis software - qualitative or quantitative - as a sort of calculator.
The same thing applies to what people used to talk about - machine learning algorithms - before "artificial intelligence," which is admittedly a much sexier term, became so popular. As an aside, I wonder how many people visualize "artificial intelligence" as female. I'm thinking of many references through time, including the original "Star Trek" computer voice, Siri and Alexa, OnStar, the 1970s "Stepford Wives," the 1927 Metropolis movie poster, Rosie from "The Jetsons," and Scarlett Johansson in "Her." Of course there are some discomforting cases: the "Lost in Space" robot, and KITT, the car from "Knight Rider." I am personally curious how often ChatGPT and its competitors have already been asked to analyze qualitative data. I also wonder how people phrase this request and how it might be phrased. And, as another aside, this has been so far the most engaging thing I have read about interacting with AI: https://www.theguardian.com/books/2023/sep/05/proust-chatgpt-and-the-case-of-the-forgotten-quote-elif-batuman
As another aside, a scholar I spoke with a few weeks ago noted that one of the challenges with AI is that you really need to know about what you are asking in order to ask the right question. This sounds a lot like what led to the answer being "42" (from Adams's The Hitchhiker's Guide to the Galaxy). Through a few other conversations about both machine learning and AI, and my own explorations of content analysis, text management, and related issues, I have concluded that, for the most part, frequency drives results (this feels like my own frequency-based conclusion, of course). Not surprisingly, this underlies most quantitative analysis, since conventional training is frequentist as opposed to Bayesian. Frequency as the main, if not the only, criterion also drives a great deal of the qualitative or non-structured data analysis I have encountered. And by "non-structured," I mean text responses that may not be analyzed in a qualitative way - for example, counting word occurrences instead of trying to figure out what people think. In one source I just read, the author/programmer/researchers reduced all words to their root or stem form. I felt this was not a great choice, and I identified some reasons why I find it problematic. These have to do with the word itself and the variable meanings ascribed to words in context - which, for a behavioral researcher, is everything. One of my best example words was f#ck - which has multiple applications that often have nothing to do with the literal meaning - but instead I'll use something a little more innocuous to explore in detail. Imagine you have a verbatim text transcript that includes verbal mannerisms - like, you know, etc. Next imagine the interview is about preferences. Now consider briefly what a calculator is going to find when it counts instances of "like" from a person who uses this word both to express preference and as a mannerism.
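To make the counting problem concrete, here is a minimal Python sketch. The transcript string and the naive_stem function are my own invented illustration - not the method from the source I mention above - but they show how stem-based counting silently merges distinct words into one tally:

```python
from collections import Counter
import re

# Invented illustrative transcript: "like" appears as a preference,
# as a verbal mannerism, and (after stemming) via "likely."
transcript = (
    "I like the blue one, you know, it's like really bright. "
    "I'd likely pick it again, like, if I had to choose."
)

def naive_stem(word):
    """Toy stemmer: strip a few common suffixes so 'likely' collapses to 'like'."""
    for suffix in ("ly", "ed", "s"):
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word

words = re.findall(r"[a-z']+", transcript.lower())
raw_counts = Counter(words)                       # exact tokens only
stem_counts = Counter(naive_stem(w) for w in words)  # stems lumped together

print(raw_counts["like"])   # 3 - and the counter cannot tell which were mannerisms
print(stem_counts["like"])  # 4 - "likely" has now been absorbed as well
```

Neither number tells you anything about preference: the calculator happily reports a frequency while the meaning - preference, filler, or probability - is lost.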
If you reduce to the root word, then "likely," or probable, may also get lumped into the "like" count. If you do not look at key words in context (AntConc is a great resource for that type of analysis), then you have missed the opportunity to distinguish among "do not like," "like very much," and "used to like but not anymore." You can certainly program or prompt whatever calculator/AI/algorithm you are using to look around the word or words of interest (or use AntConc, linked above), but one issue with this has to do with the non-linear, rambling way some people speak. You may capture the proximate context and still fail to see, for example, that the speaker was quoting someone else: "My partner told me 'I like this very much,' and I cannot figure out why, because I don't." If your aim is something like content analysis of unstructured data - for example, you want to identify the most frequently used words and get some basic idea of context - calculator-type programs are probably fine. But if you really want to know what people said about something, including how they said it and any values they attach to it, there is no substitute for close reading and meaning-unit-type open coding. What a lot of this seems to come down to is people not wanting to spend time, not perceiving they have time, or being intimidated by the idea of conducting in-depth qualitative analysis. I do my part to help with the intimidation barrier - by teaching qualitative courses and helping people who ask me for help, including those in my institution and those in other organizations - but as for the time-related barriers, I tend to think that if you do not have time to deal with your data, you should take that into account before you ask people to contribute it.
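A dedicated concordancer like AntConc does this far better, but a minimal key-word-in-context sketch in Python (the kwic function, sample sentence, and window size are my own illustration) shows what "looking around the word of interest" amounts to - and why a fixed window can still miss that a speaker is quoting someone else:

```python
import re

def kwic(text, keyword, window=4):
    """List each occurrence of keyword with `window` words of context per side."""
    tokens = re.findall(r"[A-Za-z']+", text)
    lines = []
    for i, tok in enumerate(tokens):
        if tok.lower() == keyword.lower():
            left = " ".join(tokens[max(0, i - window): i])
            right = " ".join(tokens[i + 1: i + 1 + window])
            lines.append(f"{left} [{tok}] {right}")
    return lines

# The quoted-speech example from above: the first "like" is the partner's
# opinion, not the speaker's - the window alone cannot reveal that.
sample = ("My partner told me I like this very much and I cannot "
          "figure out why because I don't like it at all")

for line in kwic(sample, "like"):
    print(line)
```

The output shows a few words on either side of each hit, which is enough to separate "don't like" from "like very much," but not enough to notice that the first occurrence sits inside reported speech - that still takes a human reading the whole passage.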
Author: I am Sheryl L. Chatfield, Ph.D., C.T.R.S. I am a member of the faculty in the College of Public Health at Kent State University. I also co-coordinate the Graduate Certificate in Qualitative Research and I am a member of the Design Innovation Team at Kent State.

Archives: February 2024