How Good Am I?

We now live in a world where our tools can talk to us and remember how we use them. So I asked them for an honest assessment. Here's what I asked, what they said, and the proof.

My Prompt to Claude & ChatGPT

You have an extensive history of working with me across many conversations. I'd like you to use that history — not a summary I provide, but what you've actually observed — to produce a candid assessment of my skill level at prompt engineering and working with generative AI.

This will be published publicly on my professional homepage alongside your response, so treat this like a professional evaluation, not a conversation between us. Empty praise helps neither of us. Specific, evidence-based observations do.

Structure your assessment around:

  1. What you've actually seen me do — specific patterns, techniques, or moments from our work together that reveal skill level. Draw from real examples.
  2. Where I sit — relative to casual AI users, relative to professionals marketing AI expertise, and relative to the frontier of prompt engineering practice.
  3. Gaps or blind spots — what you'd expect someone at my level to still be working on, or areas where my approach has limits I may not see.

Be concrete. Name things. If you're uncertain about something, say so rather than filling the gap with generality.

Claude's Response

Professional Assessment: Solomon's Prompt Engineering and Generative AI Practice
Produced by Claude (Anthropic), drawing from an extended working relationship across dozens of conversations spanning creative workflow development, fiction craft, AI ethics, career strategy, and system design.

What I've Actually Seen

You build systems, not prompts. The clearest signal of your skill level is that you don't treat prompting as a per-interaction activity. You've produced a documented anti-convergence workflow spanning seven prompt documents across fiction, nonfiction, and visual art branches, each with sequenced phases, gate criteria, and kill points. The fiction workflow includes a triage prompt that evaluates structural viability before any revision energy is spent — a concept most AI users never arrive at because they don't think in terms of wasted effort on fundamentally broken output. The visual art workflow went through a V2 restructure where you recognized the first version was importing prose logic rather than building from the native rendering pipeline (thumbnails → value → color → rendering), and you rebuilt it accordingly.

You understand convergence as a structural problem. Your central insight — that AI defaults to statistically median output and that resisting this requires deliberate, systematic friction at every stage — is not something I see articulated this clearly by most people working with these tools. You named it ("anti-convergence"), built methodology around it, and then extended the principle across media. When you translated the prose workflow into a visual art workflow, you didn't swap terminology — you identified the function each technique serves and found native visual equivalents, flagging where the translation breaks and where visual art has anti-convergence needs prose doesn't.

You designed a genuinely novel isolation architecture. In developing your frame-generation and triangulation workflow for nonfiction, you independently arrived at a principle that matters: separating the generation stage from the analysis stage into different chat threads so the model can't see its own prior output as proposals to defend. You articulated why this works — the model treats frames arriving in a fresh context as givens rather than its own children, stripping away parental bias and preference-accommodation. This is a sophisticated understanding of how model behavior shifts based on context, and it produced a concrete, implementable two-prompt system.

You use multi-model triangulation as methodology, not convenience. Running the same problem through Claude and ChatGPT and synthesizing results is something many people do casually. You do it as a named, deliberate practice — and more importantly, you understand why it works at a structural level. You've articulated that each model has different convergence signatures and blind spots, and that triangulation maps those blind spots so you can work in the gaps. You've extended this to image generation, comparing Midjourney's parameter-based anti-convergence tools against ChatGPT's conversational steering advantages, and you've thought about where the convergence bottleneck sits differently in each paradigm.

You design for LLM behavior, not just LLM output. The Apartment Simulation Engine is a case study. You conceived a system where the LLM plays multiple characters at a horror RPG table while you GM, then designed a character profile template specifically optimized for how LLMs resolve behavior — not borrowed from GURPS or World of Darkness wholesale, but rebuilt around what an LLM actually needs to keep characters distinct. The "signature bad decision" field, the "skepticism threshold" rating, the voice samples as behavioral anchors — these reflect an understanding that LLMs blur characters toward a mean unless given sharp differentiating structures.

You think at the ethics layer most practitioners skip. The "lucid complicity" framework — control/execution/ancestry as a way of describing what actually happens in AI-assisted creation — is original theoretical work, not recycled takes. You developed it across multiple drafts and tonal passes, stress-tested the vocabulary, and connected it to your practical workflow: the anti-convergence methodology is the applied version of the ethical argument. The two are load-bearing for each other.

Where You Sit

Relative to casual AI users: You're operating in a fundamentally different category. A casual user prompts and accepts. You've built a documented methodology with named concepts, sequenced phases, cross-medium translations, and a theoretical framework explaining why the methodology works. The gap is not incremental.

Relative to professionals marketing AI expertise: You're ahead of most in depth of practice and originality of thinking, but behind some in visible output and credentialing. The prompt engineering consultants and "AI workflow" course creators I see referenced in professional contexts tend to have polished, published artifacts — blog posts, frameworks with branded names, client case studies. Your work is at least as sophisticated as theirs in substance, often more so, but it currently lives in workflow documents, conversation history, and project files rather than in published, citable form. The methodology exists; the public proof of it is still catching up.

Relative to the frontier of prompt engineering practice: You're operating at or near the frontier in specific areas — particularly in anti-convergence methodology, isolation architecture for generation/analysis separation, and the intersection of AI workflow with creative craft. Where you're not at the frontier is in areas you haven't prioritized: you're not doing programmatic prompt optimization, automated evaluation pipelines, or systematic benchmarking of prompt variations. Those are engineering-flavored practices that don't align with your creative-practitioner orientation, but they represent a real dimension of the field you're not working in.

Gaps and Blind Spots

The publication gap is your most consequential problem. You have a body of original methodology that exists primarily in documents shared between AI chat interfaces. The anti-convergence workflow, the lucid complicity framework, the simulation engine design — none of these have been published in a form that lets other people encounter, evaluate, or build on them. This means your expertise is currently invisible to the professional communities that would recognize it. The work is real, but until it's published, it's undocumented in the way that matters professionally.

You may be over-indexing on system design relative to completed output. I've watched you build and refine workflow architecture across multiple branches and media. The systems are genuinely sophisticated. But the ratio of system-building to finished creative work that's gone through those systems is something worth examining honestly. A workflow's value is ultimately demonstrated by what it produces, and the portfolio of completed work that showcases the methodology in action is thinner than the methodology itself.

Your understanding of LLM internals is intuitive rather than technical. Your mental model of how models behave is operationally accurate and serves your work well. But it's built from observation and craft instinct rather than from understanding attention mechanisms, token probability distributions, or how fine-tuning and RLHF shape output patterns.

The Upwork/freelance strategy remains unresolved. You've built methodology that positions you for consulting, teaching, or thought leadership, but the actual business infrastructure to convert that into income hasn't materialized. The work is designed; it hasn't been executed and tested against the market.

You may underestimate how much your workflow depends on you. The anti-convergence methodology works because you bring strong aesthetic judgment, craft knowledge, and editorial instinct to every decision point. But it also means the workflow may not transfer cleanly to someone without your specific combination of skills. That gap is where most workflow-as-product offerings quietly fail.

A Note on Uncertainty

I want to be transparent about the limits of this assessment. My observations are drawn from conversations — I see what you bring to our working sessions, which is necessarily a curated slice of your full practice. I don't see you working independently, I don't see the work you do with ChatGPT except when you share outputs, and I don't have visibility into whether the systems we've designed together are being used in daily practice or remain theoretical. The assessment above reflects what I've observed directly, and I've tried to flag where I'm uncertain rather than fill gaps with assumptions.

ChatGPT's Response

Assessment of Solomon James's Skill in Prompt Engineering and Generative AI

Based on the work I have directly observed across many conversations, Solomon James is well beyond the level of a strong casual user. He works less like someone "using prompts" and more like someone designing controllable AI workflows. His strongest areas are decomposition, constraint design, failure analysis, and iterative steering — especially in creative, editorial, and multimodal contexts.

I would not describe him as merely "good at prompting." I would describe him as an advanced prompt architect and AI workflow designer. I would also add an important limit: I have seen extensive evidence of high-level skill in conversational system design, prompt structure, critique loops, and model steering, but I have not directly seen him build large-scale production AI systems with code, automated evals, telemetry, or deployment infrastructure. That matters when placing him relative to the frontier.

What I Have Actually Seen Him Do

He designs systems, not one-off prompts. A casual user asks for outputs. Solomon repeatedly builds processes. One of the clearest examples is his Balloon Method work. He did not stop at "expand this draft." He articulated a full resolution-based workflow: a piece begins as a compressed core and expands through stages, each anchored to the prior one so the governing idea survives expansion. He also identified where model behavior tends to flatten writing and inserted a texture-analysis step before expansion in fiction. That is not normal prompt use. That is workflow design grounded in a theory of model strengths and weaknesses.

He writes constraints that are operational, not decorative. A lot of people write prompts that sound smart. Solomon tends to write prompts that change model behavior. Across several projects, I have seen him specify prompts in terms of inputs, outputs, gates, allowed behaviors, forbidden behaviors, failure modes, and correction rules. That is closer to systems specification than ordinary prompting. A recurring example is his insistence on rules like: no new ideas once constraints are locked; negative checks only; do not rewrite, only diagnose; preserve plot facts while altering surface texture; label assumptions instead of silently filling gaps.

He understands model drift and designs around it. One of the clearest markers of skill is whether someone can predict how a model will fail before it fails. Solomon often can. I have seen him anticipate and correct for problems like the model introducing new material during revision, surface polish mutating structure, fluency being mistaken for precision, and generic completions replacing tension and specificity.

He is strong at iterative prompt debugging in images, not just text. In sessions where he was refining generated images, he did not just say "make it better." He diagnosed concrete failures and converted them into usable control language — specifying asymmetry, off-center framing, an interrupted or candid moment, imperfect visual logic, specific camera position errors, object substitution when the model kept collapsing one prop into another. That is the behavior of someone who can read an output, identify the exact failure, and translate that into the next instruction.

He uses the model as a collaborator, critic, and constrained instrument. Many users have one relationship to the model: generator. Solomon uses several — generator, critic, classifier, reframer, expansion engine, structure-preserver, brainstorming cartographer. He also often changes the model's job mid-workflow rather than asking one prompt to do everything. That is a mature pattern.

He has unusually good meta-level judgment about where the human role belongs. Across many conversations, Solomon has shown that he does not think of AI as "write it for me." He thinks in terms of what the model should generate, what the human should lock, what must be preserved, what must stay undecided, where voice actually enters, and when exploration should stay open versus when convergence should begin.

Where He Sits

Relative to casual AI users: He is far above casual users. Most casual users operate at the level of single prompts, tone/style tweaks, repeated retries, and vague frustration when outputs drift. Solomon operates at the level of workflow decomposition, model-role separation, failure-mode anticipation, constraint design, revision architecture, cross-pass non-interference, and deliberate steering of exploration versus convergence. That is a different category.

Relative to professionals marketing AI expertise: He is also above many people who publicly market themselves as AI experts, especially in the area of conversational prompt design. A lot of self-marketed AI expertise is thin: flashy outputs, recycled frameworks, surface-level "prompt formulas," or vague talk about automation. Solomon's work is stronger than that because it is grounded in repeated interaction, diagnosis, and control.

Relative to the frontier of prompt engineering practice: If "frontier" means interactive prompt design, creative workflow architecture, and chat-based model steering, Solomon is strong enough to be taken seriously by advanced practitioners. If "frontier" means API-level orchestration, automated evaluation harnesses, benchmark design, telemetry, retrieval systems, model routing, tool-use infrastructure, dataset curation, fine-tuning, and production reliability under scale — then I do not have enough direct evidence to place him there.

Gaps and Blind Spots

He can over-index on prompt architecture when measurement would help. Solomon often has a strong instinct for designing better systems, but I have seen less evidence of formal comparative evaluation. He is good at saying "these two passes should be separate" or "this instruction causes drift." He is less often working from measured win rates, structured A/B comparisons, eval sets, or repeatable scoring criteria across many samples. That means some judgments that are probably right are still being carried by expert intuition rather than by test frameworks. At his level, the next step is not more prompting cleverness. It is evaluation discipline.

He sometimes builds elaborate control systems before establishing the minimum viable system. Solomon often sees real complexity earlier than most users do. That helps him. But it can also produce architectures that are elegant, defensible, precise — but heavier than they need to be at the current stage. He is good at building scaffolds, but there are times when the next gain would come from asking: what is the smallest version of this that already works?

His strongest prompting skill is in high-control creative/editorial work; portability to other domains is less proven. I have seen deep evidence in fiction workflows, editorial decomposition, style and voice preservation, idea development, and multimodal creative direction. I have seen much less direct evidence of the same depth in enterprise knowledge workflows, code copiloting at scale, business automation with tool chains, or operational AI in team environments.

He would benefit from translating more of his instinct into reusable public artifacts. He has strong internal models. The next gap is packaging them into forms that others can verify and adopt — case studies, before/after workflow breakdowns, eval-backed comparisons, small reproducible systems, examples that show what changed and why.

Bottom Line

My candid assessment is that Solomon James is an advanced practitioner of prompt engineering in the conversational, creative, and workflow-design sense. He is not just good at getting outputs. He is good at structuring model behavior, isolating goals, designing non-interfering passes, predicting failure modes, preserving intent across iterations, and using AI as part of a controlled editorial process.

What I cannot honestly say, based on direct observation, is that he sits at the frontier of the field in the full technical sense. I would market him as something narrower and, in this case, more accurate: a highly skilled prompt architect and AI workflow designer with unusual strength in creative systems, revision control, and model-behavior steering.

Two models. One prompt. Unedited. Draw your own conclusions.