Code Coverage

Name                    Lines                      Functions and Methods     CRAP   Classes and Traits
Total                   80.00% (48 / 60), warning  50.00% (3 / 6), danger           0.00% (0 / 1), danger
CompanyResearchService  80.00% (48 / 60), warning  50.00% (3 / 6), danger    13.15  0.00% (0 / 1), danger
  __construct           100.00% (1 / 1), success   100.00% (1 / 1), success  1
  getOrCreateForUrl     94.74% (18 / 19), success  0.00% (0 / 1), danger     6.01
  runResearch           72.22% (26 / 36), warning  0.00% (0 / 1), danger     2.09
  getTtlDays            100.00% (2 / 2), success   100.00% (1 / 1), success  1
  isFresh               100.00% (1 / 1), success   100.00% (1 / 1), success  1
  getByDomain           0.00% (0 / 1), danger      0.00% (0 / 1), danger     2

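The CRAP values in the report can be cross-checked by hand. The sketch below assumes the usual Change Risk Anti-Patterns formula, where comp(m) is the cyclomatic complexity of m and cov(m) is its line coverage as a fraction. The complexities used (6, 2, 1, and 12 for the class as a whole) are not printed in the report; they are inferred from the source and should be read as assumptions.

```latex
\[
\mathrm{CRAP}(m) = \mathrm{comp}(m)^{2}\,\bigl(1 - \mathrm{cov}(m)\bigr)^{3} + \mathrm{comp}(m)
\]

\begin{align*}
\texttt{getOrCreateForUrl}: &\quad 6^{2}\left(1 - \tfrac{18}{19}\right)^{3} + 6 = \tfrac{36}{6859} + 6 \approx 6.01 \\
\texttt{runResearch}: &\quad 2^{2}\left(1 - \tfrac{26}{36}\right)^{3} + 2 \approx 0.086 + 2 \approx 2.09 \\
\texttt{getByDomain}: &\quad 1^{2}\,(1 - 0)^{3} + 1 = 2 \\
\texttt{CompanyResearchService}: &\quad 12^{2}\left(1 - \tfrac{48}{60}\right)^{3} + 12 = 1.152 + 12 \approx 13.15
\end{align*}
```

Under these assumptions, every reported value is reproduced. A fully covered method collapses to its complexity alone, which matches the score of 1 for __construct, getTtlDays, and isFresh.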
<?php

namespace App\Http\Services\RolePlay;

use App\Http\Models\CompanyResearch;
use App\Http\Models\Parameter;
use App\Http\Services\NodeJsAIBridgeService;
use App\Http\Services\WebScraperService;
use App\Jobs\RefreshCompanyResearchJob;
use Illuminate\Support\Facades\Log;

/**
 * Manages the lifecycle of cached company research records.
 *
 * Coordinates website scraping (via {@see WebScraperService}) and deep
 * AI research (via {@see NodeJsAIBridgeService} with Google Search
 * grounding) to build comprehensive company intelligence used by the
 * roleplay auto-populate flow.
 *
 * Records are keyed by normalized domain and cached for a configurable
 * TTL (default 30 days, controlled by the `company_research_ttl_days`
 * system parameter).
 */
class CompanyResearchService
{
    /**
     * The research prompt template.
     *
     * The `{domain}` placeholder is replaced at runtime with the actual domain.
     */
    private const RESEARCH_PROMPT = <<<'PROMPT'
You are a business intelligence analyst. Conduct an exhaustive deep-dive research on the company at {domain}.

Search broadly: their main website, /about, /about-us, /products, /services, /solutions, /pricing, /team, /careers, /blog, /press, /news, /customers, /case-studies, and any public third-party sources (press releases, industry articles, LinkedIn, Crunchbase, Glassdoor, G2, etc.).

Your goal is to extract and document the MAXIMUM amount of factual information. Do NOT summarize or condense. Write extensively — include every detail you find. This data will be stored as a reference database and used later for targeted content generation. More detail is always better.

Cover ALL of the following sections in depth:

### 1. Company Overview
- Full legal/operating name, any DBA names or brand names
- Detailed description of what the company does (multiple paragraphs)
- Mission statement, vision, and core values (quote verbatim if found)
- Founding year, founders' names and backgrounds
- Headquarters location (full address if available)
- Other office locations
- Company size: number of employees, revenue (if public)
- Ownership structure (private, public, employee-owned, PE-backed, etc.)
- Brief company history and major milestones

### 2. Products & Services
- List EVERY product and service offering individually
- For each product: name, detailed description of what it does, who it's for, key capabilities
- Product categories or product lines
- Any free trials, freemium tiers, or demo offerings
- Platform details (web app, mobile app, browser extension, API, etc.)
- Any white-label or reseller programs

### 3. Target Market
- ALL industries and verticals they serve — list every one mentioned on their site
- Specific market segments or niches they specialize in
- Company sizes they target (startups, SMB, mid-market, enterprise)
- Geographic markets served (local, national, international — which countries/regions)
- Target buyer personas: job titles, departments, seniority levels
- Any specific use cases or scenarios they highlight

### 4. Value Propositions & Differentiators
- Every competitive advantage and unique selling point
- Specific claims they make (e.g., "saves X hours", "increases Y by Z%")
- Customer testimonials or success metrics mentioned on their site
- Awards, certifications, recognitions
- How they position themselves against competitors

### 5. Pricing Model
- Every pricing tier, plan name, and price point (if publicly available)
- What's included in each tier
- Free vs. paid feature differences
- Enterprise/custom pricing indicators
- Any volume discounts, annual vs. monthly pricing

### 6. Clients, Partners & Case Studies
- Every named client, customer, or logo featured on their site
- Partner ecosystem: technology partners, resellers, integrations partners
- Case studies: company name, industry, problem solved, results achieved
- Testimonials: quote, person's name and title, company

### 7. Leadership & Team
- Founders and C-suite executives: names, titles, backgrounds
- Board members or advisors if listed
- Total team size and any notable team information
- DEI initiatives, culture programs

### 8. Company Culture & Employer Brand
- Core values and cultural principles
- Employee benefits and perks highlighted
- Glassdoor ratings and notable employee feedback themes
- Charitable initiatives, CSR programs

### 9. Recent News & Developments
- Product launches, feature releases (last 2 years)
- Funding rounds, acquisitions, mergers
- Strategic partnerships announced
- Media coverage, press mentions
- Blog topics and thought leadership themes

### 10. Technology & Integrations
- Technology stack and platforms
- Every integration listed (CRM, email, social, productivity tools)
- API availability and developer resources
- Security certifications, compliance (SOC2, GDPR, HIPAA, etc.)
- Mobile/desktop/browser support

### 11. Pain Points They Address
- Every problem or challenge they claim to solve
- Before/after scenarios described on their site
- Industry-specific pain points they target
- Objections they preemptively address in their marketing

### 12. Competitive Landscape
- Named competitors or "alternatives to" mentioned anywhere
- How they differentiate from competitors
- Market category or segment they compete in

IMPORTANT RULES:
- Write extensively. Include ALL details found — do not leave out information to save space.
- Quote specific numbers, metrics, and claims verbatim when found.
- If a section has limited information available, state what you found and note the limitation.
- Use bullet points and sub-bullets for maximum detail density.
- Do not fabricate information. Only include facts you can verify from public sources.
- Target 3000+ words of output. Longer is better.
PROMPT;

    public function __construct(
        private readonly WebScraperService $scraper,
        private readonly NodeJsAIBridgeService $aiService,
    ) {}

    /**
     * Return an existing research record, or create and populate a new one.
     *
     * - Fresh + completed              => return immediately.
     * - Stale + completed              => dispatch async refresh, return stale record.
     * - Processing, pending, or failed => return as-is (caller polls or retries).
     * - Not found                      => create + run synchronously, return record.
     *
     * When a new record is created and the research fails, the exception
     * from {@see runResearch()} propagates to the caller.
     *
     * @param  string  $url  The website URL to research
     * @param  mixed  $user  The authenticated user (for logging / requested_by)
     */
    public function getOrCreateForUrl(string $url, mixed $user): CompanyResearch
    {
        $domain = CompanyResearch::extractDomain($url);
        $existing = CompanyResearch::where('domain', $domain)->first();

        if ($existing) {
            if ($existing->status === 'completed' && $this->isFresh($existing)) {
                return $existing;
            }

            if ($existing->status === 'completed' && ! $this->isFresh($existing)) {
                RefreshCompanyResearchJob::dispatch($existing);

                return $existing;
            }

            // processing, pending, or failed — return as-is
            return $existing;
        }

        // New domain — create and run synchronously
        $research = CompanyResearch::create([
            'domain' => $domain,
            'url' => $url,
            'status' => 'processing',
            'requested_by' => (string) ($user->id ?? ''),
            'research' => '',
            'scraped_content' => '',
        ]);

        $this->runResearch($research);

        return $research->fresh();
    }

    /**
     * Execute the full research pipeline: scrape + AI deep research.
     *
     * On success, updates the record to `completed` with the research output.
     * On failure, marks the record as `failed` with the error message and
     * rethrows the exception so callers/jobs can handle retries.
     *
     * @param  CompanyResearch  $research  The research record to populate
     *
     * @throws \Exception On any scraping or AI failure
     */
    public function runResearch(CompanyResearch $research): void
    {
        try {
            $scrapedContent = $this->scraper->scrape($research->url);

            $prompt = str_replace('{domain}', $research->domain, self::RESEARCH_PROMPT);

            $result = $this->aiService->generate(
                [
                    'provider' => 'vertex',
                    'model' => 'gemini-3.1-flash-lite-preview',
                    'prompt' => $prompt,
                    'config' => [
                        'maxOutputTokens' => 16000,
                        'temperature' => 0.7,
                        'topP' => 0.95,
                        'thinkingBudget' => 4096,
                        'enableGoogleSearch' => true,
                    ],
                ],
                [
                    'feature' => 'company_research',
                    'user_id' => $research->requested_by,
                ]
            );

            $research->update([
                'status' => 'completed',
                'research' => $result,
                'scraped_content' => $scrapedContent,
                'error' => null,
            ]);
        } catch (\Exception $e) {
            Log::error('[CompanyResearchService] Research failed', [
                'domain' => $research->domain,
                'error' => $e->getMessage(),
            ]);

            $research->update([
                'status' => 'failed',
                'error' => $e->getMessage(),
            ]);

            throw $e;
        }
    }

    /**
     * Read the TTL (in days) from the system parameter, with a 30-day default.
     *
     * @return int Number of days a research record is considered fresh
     */
    public function getTtlDays(): int
    {
        $param = Parameter::where('name', 'company_research_ttl_days')->first();

        return (int) ($param?->value ?? 30);
    }

    /**
     * Check whether a research record is still within the TTL window.
     *
     * @return bool True if the record's updated_at is within TTL days of now
     */
    public function isFresh(CompanyResearch $research): bool
    {
        return $research->updated_at->diffInDays(now()) < $this->getTtlDays();
    }

    /**
     * Look up a research record by normalized domain.
     *
     * @param  string  $domain  The normalized domain to search for
     */
    public function getByDomain(string $domain): ?CompanyResearch
    {
        return CompanyResearch::where('domain', $domain)->first();
    }
}
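
The branch order in getOrCreateForUrl() and the strict "less than TTL" comparison in isFresh() can be restated as plain functions, which may help when deciding which cases still need tests. This is a minimal sketch and not part of the service: decideResearchAction() and isWithinTtl() are hypothetical names, the action labels are invented for illustration, and the freshness check uses PHP's built-in DateTimeImmutable in place of the Carbon call in the class.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper mirroring the branches of getOrCreateForUrl().
 * A null $status stands for "no record exists for this domain".
 */
function decideResearchAction(?string $status, bool $isFresh): string
{
    if ($status === null) {
        // New domain: the service creates a record and runs research synchronously.
        return 'create_and_run';
    }

    if ($status === 'completed') {
        // Fresh records are returned directly; stale ones also trigger an async refresh.
        return $isFresh ? 'return_existing' : 'dispatch_refresh_and_return';
    }

    // processing, pending, or failed: returned as-is, no work is started.
    return 'return_existing';
}

/**
 * Hypothetical stand-in for isFresh(): true while fewer than $ttlDays
 * whole days have elapsed between $updatedAt and $now.
 */
function isWithinTtl(DateTimeImmutable $updatedAt, DateTimeImmutable $now, int $ttlDays): bool
{
    return $updatedAt->diff($now)->days < $ttlDays;
}
```

With a 30-day TTL, a record updated 29 whole days ago counts as fresh and one updated exactly 30 days ago counts as stale. A failed record is never retried by getOrCreateForUrl() itself, so any retry has to come from the caller or from a job.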