Engineering Challenges

👋 Welcome!

You’ve landed here because we’re considering your profile for our team.

This page is designed to give you a better idea of what to expect from working at Veridion and of the kinds of problems we invest our time in. The challenges we prepared will allow you to showcase how you think, solve problems, and approach real-world challenges, and also give you a glimpse into the types of tasks you will be facing in your role here.

The way we work

The thought process behind our problem-solving approach

Solve hard, useful, and unresolved problems

The tech we use enables novel solutions, and as such, we tackle the hard stuff that others won’t, and focus on problems that haven’t been solved yet.

Creativity over convention

We don’t follow templates. Most problems we face have no prior solution, so we have to come up with our own ways of getting things done.

Prioritize solving the problem

We look for the most effective way to solve the problem rather than committing to a specific tool or framework. We employ whatever technique is necessary without imposing artificial limitations.

Deliver real-world value

We build solutions that genuinely work and provide actual value, effectively balancing development time, accuracy, and speed.

Exponential growth

We scale by continuously learning, iterating, and pushing boundaries for both our product and our team.

What makes a great solution

Here’s what we expect from a remarkable project:

The way you deliver this project reflects your work ethic and how you tackle problems from start to finish. So, please ensure your project is as ‘production-ready’ as possible. If you’re not ready to give it your best, it’s probably not worth doing at all. It’s pointless to submit 40 lines of code and call it a solution to one of the challenges below.

The goal is to show that you can make an impact and approach problems with the right mindset, just as you would as part of our team. While we don’t expect you to have all the answers right now, we do expect you to demonstrate a strong work ethic, the potential for growth, and the ability to tackle complex problems effectively. Because if you don’t prove that, we can’t afford to take you seriously as a potential team member. If hired, your effort during the recruiting process will be rewarded with a bonus—this is not about free work.

Correctness

Does your solution meet all the requirements and constraints given in the challenge?

Robustness

How well does your solution handle unexpected input or edge cases?

Code quality

Is the code well-organized, readable, and maintainable?

Extra mile

Does your solution only address the bare minimum or does it go beyond the surface?

Presentation

How well does the presentation reflect your reasoning, why you made certain decisions and the thought process behind your solution?

The builder mindset

What if we told you there’s a straightforward way to significantly increase your chances of securing a role at Veridion?
You might be skeptical, but this is it:

01 Commit to thoroughly understand and solve the real problem.
02 Know achieving excellence requires effort.
03 Don't underestimate the complexity.
04 Avoid all things conventional.
05 Be persistent and keep an open mind.
06 Don't lose sight of the context.
07 Focus on mastering the logic and choose the tools that empower you.

Take your time to understand the problem, but don’t expect to have endless time for perfection. All of these challenges are solved with code, but the more important component is a real understanding of the problem, which comes from understanding the data you’re working with.

Candidates who succeed in solving these challenges are those who focus on building solutions that add real value, rather than just writing abstract code disconnected from the problem they’re solving.

If you expect to succeed without prior or current effort, you may need to reconsider your approach. If you can solve any of these problems well, it’s because you either:
- have invested effort in the past, building your skills to tackle unknowns and difficult challenges.
- are willing to invest the effort now, working diligently to raise your skill level.

This is a complex task: it takes an experienced candidate at least a few hours to complete, and longer for someone less senior. If it takes you only two hours, it’s probably because you haven’t invested enough time in investigating and understanding the problem you’re trying to solve.

Don’t let the complexity discourage you. The requirements are designed to test your ability to deeply understand the problem and conduct a bit of research before jumping into writing code.

Don’t limit yourself to methods others have used to solve similar problems in the past. There’s a good chance you can find a better solution if you allow yourself to be creative and invest a bit more time to study the problem from multiple perspectives.

Anticipate that it will be hard, uncertain, and occasionally filled with confusing or frustrating moments, which is what builds resilience and makes a difference in the long run.

We don't like wasting (anyone's) time. These are not random tasks designed to test your ability to write out-of-context code just so we can see some syntax before the interview.

Consider why any of these projects would be relevant to the recruitment process for this role and how they relate to what we do at Veridion.

No matter which challenge you choose, you can use any programming language or toolset you're comfortable with. We learn to master new tools every day, and we think the reasoning behind your decisions is what counts the most. You can probably get up to speed with the rest rather quickly.

Choose your track

Check out the challenges below and choose the one that you think you’ll do your best at. They’re very similar to the type of tasks you’ll work on once you join our team.

Can’t wait to see what you come up with. Enjoy the process—we’re confident you’ll kick ass!

#1 Logo Similarity

Task

Match and group websites by the similarity of their logos.

Context

Logos are instrumental for a company’s identity – they’re the symbol that customers use to recognize your brand. Ideally, you’ll want people to instantly connect the sight of your logo with the memory of what your company does – and, more importantly, how it makes them feel.

Guidelines

Take the time to deeply understand the problem before writing code. Even the most sophisticated solution is ineffective if it solves the wrong problem. Misalignment in problem definition leads to incorrect conclusions and wasted effort.
We know this is a clustering problem, and you know this is a clustering problem. The question is: can you do it without ML algorithms (like DBSCAN or k-means clustering)?
Check whether the program correctly extracts the logos and matches them properly (as a human, you instantly recognize them, but this is way harder for a machine).
Explore this from as many different angles as you can. It will generate valuable questions.
From a tech stack perspective, you can use any programming language, toolset or libraries you’re comfortable with or find necessary, especially if you know it would be a better option or a more interesting one (we generally prefer Node, Python, Scala).
At Veridion, we run similar algorithms on billions of records. While your solution doesn’t need to scale to that level, it would be impressive if it does. For now, however, what matters most is your approach to solving the problem—if your solution is exceptional for the given dataset, we trust that you can scale it effectively using the right tools.
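
As an illustration of what grouping without an off-the-shelf clustering algorithm could look like, here is a minimal sketch, not the expected solution. It assumes each logo has already been downscaled to a small grayscale grid (plain lists of 0-255 values). The names `average_hash` and `group_by_similarity` and the `max_distance` threshold are hypothetical choices made for this example.

```python
from itertools import combinations


def average_hash(pixels):
    """Return a bit tuple: 1 where a pixel is brighter than the grid mean."""
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    return tuple(1 if value > mean else 0 for value in flat)


def hamming(a, b):
    """Count the positions where two equal-length fingerprints differ."""
    return sum(x != y for x, y in zip(a, b))


def group_by_similarity(hashes, max_distance=1):
    """Link sites whose hashes are within max_distance, then return the
    connected components using a small union-find structure."""
    parent = {site: site for site in hashes}

    def find(site):
        while parent[site] != site:
            parent[site] = parent[parent[site]]  # path halving
            site = parent[site]
        return site

    for a, b in combinations(hashes, 2):
        if hamming(hashes[a], hashes[b]) <= max_distance:
            parent[find(a)] = find(b)

    groups = {}
    for site in hashes:
        groups.setdefault(find(site), []).append(site)
    return sorted(sorted(members) for members in groups.values())


# Toy 2x2 "logos" (assumed data); real input would come from extracted images.
logos = {
    "a.example": [[250, 10], [240, 20]],
    "b.example": [[255, 0], [235, 30]],
    "c.example": [[10, 250], [20, 240]],
}
hashes = {site: average_hash(grid) for site, grid in logos.items()}
groups = group_by_similarity(hashes)
```

Comparing every pair is fine for a small list but grows quadratically, so a larger dataset would likely need some form of bucketing before comparison.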

Resources

If you’re ready to jump into the problem, please start with the following list of company websites:

logos_list

Expected Deliverables

Solution explanation / presentation
Provide an explanation or presentation of your solution and results. You have total creative freedom here—feel free to impress with your thinking process, the paths you took or decided not to take, the reasoning behind your decisions and what led to your approach.
Output
Your algorithm should be able to extract logos for more than 97% of the websites in the given dataset.
Your program should output multiple groups, each containing one or more websites (some logos may be unique to a single website). Make sure to upload your results along with the code.
Code and Logic
Include the code that enabled you to achieve this task for the provided list, and ideally, for any list of any size.
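
To make the 97% extraction target measurable, one option is to track how many sites yield at least one logo candidate. The sketch below is an assumption-laden starting point: it presumes the HTML has already been downloaded, and the heuristics (an `<img>` whose attributes mention "logo", falling back to a favicon `<link>`) as well as the names `LogoCandidateParser`, `find_logo`, and `coverage` are illustrative only.

```python
from html.parser import HTMLParser


class LogoCandidateParser(HTMLParser):
    """Collect image URLs whose attributes mention 'logo', plus favicons."""

    def __init__(self):
        super().__init__()
        self.logos = []
        self.icons = []

    def handle_starttag(self, tag, attrs):
        attributes = {name: (value or "") for name, value in attrs}
        if tag == "img" and attributes.get("src"):
            hints = " ".join(
                attributes.get(key, "") for key in ("src", "alt", "class", "id")
            )
            if "logo" in hints.lower():
                self.logos.append(attributes["src"])
        elif tag == "link" and "icon" in attributes.get("rel", "").lower():
            if attributes.get("href"):
                self.icons.append(attributes["href"])


def find_logo(html):
    """Return the first candidate URL, or None when nothing was found."""
    parser = LogoCandidateParser()
    parser.feed(html)
    parser.close()
    candidates = parser.logos or parser.icons  # prefer explicit logo images
    return candidates[0] if candidates else None


def coverage(pages):
    """Fraction of pages for which a logo candidate was found."""
    found = sum(1 for html in pages.values() if find_logo(html) is not None)
    return found / len(pages)


# Assumed sample pages, keyed by site.
pages = {
    "a.example": '<img class="site-logo" src="/img/brand.png">',
    "b.example": '<link rel="shortcut icon" href="/favicon.ico"><img src="/hero.jpg">',
    "c.example": "<p>No images here</p>",
}
ratio = coverage(pages)
```

The sites that return `None` are the interesting ones to inspect by hand, since they show which extraction cases the heuristics miss.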

Submit your project

When you’re finished with the challenge, please submit the link to your GitHub project below.

#2 Company Classifier

Task

Build a robust company classifier for a new insurance taxonomy.

Objectives

Accept a list of companies with associated data:
– Company Description
– Business Tags
– Sector, Category, Niche Classification
Receive a static taxonomy (a list of labels) relevant to the insurance industry.
Build a solution that accurately classifies these companies, and any similar ones, into one or more labels from the static taxonomy.
Present your results and demonstrate effectiveness.

Guidelines

Since this is an unsolved problem without a predefined ground truth, you’ll need to validate your classifier’s performance through your own methods.

Analyze strengths and weaknesses:
- Explain where your solution excels and where it may need improvement.
- Discuss scalability and how your solution performs with large datasets.
- Reflect on any assumptions made and unknown factors that could impact your solution.
Ensure your solution truly addresses the problem
- Focus on solving the actual problem, not just implementing complex algorithms. Using embeddings, zero-shot models, TF-IDF, clustering, or other techniques is meaningless if companies are misclassified due to a flawed approach. A well-designed solution is more important than an impressive algorithm.
- Your evaluation should demonstrate that your solution effectively addresses the problem. Simply plotting similarity scores or reporting F1 and accuracy metrics without meaningful validation only measures alignment with your own heuristic, not real-world effectiveness.
- Take the time to deeply understand the problem before writing code. Even the most sophisticated solution is ineffective if it solves the wrong problem. Misalignment in problem definition leads to incorrect conclusions and wasted effort.
Provide insights into your problem-solving process:
- Why you did what you did, what other paths you considered, and especially why you chose not to pursue them.
At Veridion, we run similar algorithms on billions of records. While your solution doesn’t need to scale to that level, it would be impressive if it does. For now, however, what matters most is your approach to solving the problem—if your solution is exceptional for the given dataset, we trust that you can scale it effectively using the right tools.
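
The validation point above can be made concrete with a small example. The sketch below is a deliberately naive baseline, not a recommended classifier: it assigns every label whose words mostly appear in the company text, then checks the predictions against a hand-reviewed sample instead of against its own scores. The taxonomy labels, company texts, `min_overlap` threshold, and function names are all made up for illustration.

```python
import re


def tokens(text):
    """Lowercase word tokens, ignoring very short words."""
    return {word for word in re.findall(r"[a-z]+", text.lower()) if len(word) > 2}


def classify(company_text, taxonomy, min_overlap=0.5):
    """Return every label whose tokens are mostly present in the company text."""
    company_tokens = tokens(company_text)
    matched = []
    for label in taxonomy:
        label_tokens = tokens(label)
        if not label_tokens:
            continue
        overlap = len(label_tokens & company_tokens) / len(label_tokens)
        if overlap >= min_overlap:
            matched.append(label)
    return matched


def agreement(predictions, reviewed):
    """Share of hand-reviewed companies where predicted and reviewed labels intersect."""
    hits = sum(
        1
        for name, labels in reviewed.items()
        if set(predictions.get(name, [])) & set(labels)
    )
    return hits / len(reviewed)


taxonomy = ["Roofing Services", "Software Development"]  # made-up labels
predictions = {
    "Acme Roofing": classify("Residential roofing and gutter repair services", taxonomy),
    "Byte Works": classify("Custom software development for retailers", taxonomy),
}
# Labels assigned by a person reading the descriptions, independent of the classifier.
reviewed = {
    "Acme Roofing": ["Roofing Services"],
    "Byte Works": ["Software Development"],
}
score = agreement(predictions, reviewed)
```

The useful part is the separation: `reviewed` comes from human judgment, so `score` says something about real-world effectiveness rather than about agreement with the heuristic itself.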

Resources

If you’re ready to jump into the problem, please start with the following files:

List of Companies: company_list
Insurance Taxonomy: insurance_taxonomy

Expected Deliverables

Solution explanation / presentation
Provide an explanation or presentation of your solution and results. You have total creative freedom here—feel free to impress with your thinking process, the paths you took or decided not to take, the reasoning behind your decisions and what led to your approach.
Annotated Input List
Return the input list with a new column titled “insurance_label” where you have correctly classified each company into one or more labels from the insurance taxonomy.
Code and Logic
Include the code that enabled you to achieve this classification for the provided list, and ideally, for any list of any size.
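
For the annotated list, a minimal sketch of adding the “insurance_label” column could look like the following. It assumes CSV input with a `description` column and joins multiple labels with "; "; both are assumptions, since the actual file layout is defined by the provided resources. `annotate`, `write_csv`, and `toy_classify` are hypothetical helpers.

```python
import csv
import io


def annotate(rows, classify, label_column="insurance_label", separator="; "):
    """Yield each input row with an extra column holding its joined labels."""
    for row in rows:
        annotated = dict(row)
        annotated[label_column] = separator.join(classify(row))
        yield annotated


def write_csv(rows, fieldnames, handle):
    """Write the annotated rows, header first, to an open text handle."""
    writer = csv.DictWriter(handle, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(rows)


def toy_classify(row):
    """Stand-in for a real classifier, used only to exercise the plumbing."""
    return ["Roofing Services"] if "roof" in row["description"].lower() else []


# In-memory stand-ins for the input and output files.
source = io.StringIO("description\nRoof repair contractor\n")
reader = csv.DictReader(source)
output = io.StringIO()
write_csv(annotate(reader, toy_classify), reader.fieldnames + ["insurance_label"], output)
```

Because `annotate` is a generator, rows are processed one at a time, which keeps memory use flat for lists of any size.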

Submit your project

When you’re finished with the challenge, please submit the link to your GitHub project below.

#3 Entity Resolution

Task

Identify unique companies and group duplicate records accordingly.

Context

The dataset contains company records imported from multiple systems, leading to duplicate entries with slight variations.

Guidelines

Take the time to deeply understand the problem before writing code. Even the most sophisticated solution is ineffective if it solves the wrong problem. Misalignment in problem definition leads to incorrect conclusions and wasted effort.
The dataset includes extensive company details, but not all fields are necessary for deduplication.
The key challenge is to identify and leverage the most relevant attributes to accurately detect and group duplicate records.
Take the time to research and understand what defines a company and which attributes uniquely identify it. This understanding is crucial for accurately detecting and grouping duplicate records.
At times, incomplete data may require you to make decisions where there is no clear right or wrong choice. What matters is backing each decision with the reasoning behind it.
It’s essential to document your decisions and the reasoning behind them.
From a tech stack perspective, you can use any programming language, toolset or libraries you’re comfortable with or find necessary, especially if you know it would be a better option or a more interesting one (we generally prefer Scala, Java, Python).
At Veridion, we run similar algorithms on billions of records. While your solution doesn’t need to scale to that level, it would be impressive if it does. For now, however, what matters most is your approach to solving the problem—if your solution is exceptional for the given dataset, we trust that you can scale it effectively using the right tools.
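
One common first pass for this kind of problem is to normalize a few identifying attributes and group records that share the resulting key. The sketch below shows only that step, under stated assumptions: the field names (`id`, `name`, `country`) and the suffix list are hypothetical, and which attributes actually identify a company is exactly the question the guidelines ask you to research.

```python
import re
from collections import defaultdict

# Illustrative, far from exhaustive list of legal-form suffixes.
LEGAL_SUFFIXES = {"inc", "llc", "ltd", "gmbh", "srl", "corp", "co"}


def normalize_name(name):
    """Lowercase, strip punctuation, and drop common legal-form suffixes."""
    words = re.sub(r"[^a-z0-9 ]", " ", name.lower()).split()
    return " ".join(word for word in words if word not in LEGAL_SUFFIXES)


def group_duplicates(records):
    """Group record ids that share a (normalized name, country) key."""
    groups = defaultdict(list)
    for record in records:
        country = (record.get("country") or "").lower()
        key = (normalize_name(record["name"]), country)
        groups[key].append(record["id"])
    return list(groups.values())


# Assumed sample records.
records = [
    {"id": 1, "name": "Acme, Inc.", "country": "US"},
    {"id": 2, "name": "ACME Inc", "country": "us"},
    {"id": 3, "name": "Acme GmbH", "country": "DE"},
]
duplicate_groups = group_duplicates(records)
```

Exact-key grouping misses typos and abbreviations and may wrongly merge distinct companies with the same name, so each such trade-off is worth documenting alongside the reasoning.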

Resources

If you’re ready to jump into the problem, please start with the following file:

entity_resolution

Expected Deliverables

Solution explanation / presentation
Provide an explanation or presentation of your solution and results. You have total creative freedom here—feel free to impress with your thinking process, the paths you took or decided not to take, the reasoning behind your decisions and what led to your approach.
Output
Return the updated dataset where you have correctly identified unique companies and grouped duplicate records accordingly.
Code and Logic
Include the code that enabled you to achieve the required entity resolution for the provided list.

Submit your project

When you’re finished with the challenge, please submit the link to your GitHub project below.

#4 Product Deduplication

Task

The goal is to consolidate duplicates into a single, enriched entry per product, maximizing available information while ensuring uniqueness.

Context

The dataset contains product details extracted from various web pages using LLMs, resulting in duplicate entries where the same product appears across different sources. Each row represents partial attributes of a product.

Guidelines

Take the time to deeply understand the problem before writing code. Even the most sophisticated solution is ineffective if it solves the wrong problem. Misalignment in problem definition leads to incorrect conclusions and wasted effort.
Thoroughly analyze the dataset to understand each attribute clearly.
There isn’t always a single solution to this problem. Some decisions may be neither strictly right nor wrong, but they should be supported by as many relevant factors as possible.
It’s essential to document your decisions and the reasoning behind them.
From a tech stack perspective, you can use any programming language, toolset or libraries you’re comfortable with or find necessary, especially if you know it would be a better option or a more interesting one (we generally prefer Scala, Java, Python).
At Veridion, we run similar algorithms on billions of records. While your solution doesn’t need to scale to that level, it would be impressive if it does. For now, however, what matters most is your approach to solving the problem—if your solution is exceptional for the given dataset, we trust that you can scale it effectively using the right tools.
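
Once duplicates have been identified, consolidating them into one enriched entry can be sketched as follows. This is only one possible merge policy, assumed for illustration: it keeps every distinct non-empty value per field and leaves conflicts visible as lists instead of resolving them. The field names in the sample rows are hypothetical.

```python
def merge_records(records):
    """Merge duplicate rows into one entry, keeping every distinct non-empty value."""
    merged = {}
    for record in records:
        for field, value in record.items():
            if value in (None, "", []):
                continue  # skip missing values so they never overwrite real data
            values = value if isinstance(value, list) else [value]
            bucket = merged.setdefault(field, [])
            for item in values:
                if item not in bucket:
                    bucket.append(item)
    # Collapse single-valued fields back to scalars for readability.
    return {
        field: items[0] if len(items) == 1 else items
        for field, items in merged.items()
    }


# Assumed sample: three partial rows describing the same product.
rows = [
    {"name": "Trail Boot", "brand": "Acme", "color": None},
    {"name": "Trail Boot", "brand": "", "color": "brown", "sizes": [42, 43]},
    {"name": "Trail Boot", "sizes": [43, 44]},
]
product = merge_records(rows)
```

A field that ends up as a list with conflicting values (two different brands, for instance) is a signal that either the rows are not true duplicates or a resolution rule is needed, and that decision is worth explaining.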

Resources

If you’re ready to jump into the problem, please start with the following file:

product deduplication

Expected Deliverables

Solution explanation / presentation
Provide an explanation or presentation of your solution and results. You have total creative freedom here—feel free to impress with your thinking process, the paths you took or decided not to take, the reasoning behind your decisions and what led to your approach.
Output
Return the updated dataset where you have correctly consolidated duplicates into a single, enriched entry per product, maximizing available information while ensuring uniqueness.
Code and Logic
Include the code that enabled you to achieve the required product deduplication for the provided list.

Submit your project

When you’re finished with the challenge, please submit the link to your GitHub project below.

#5 HTML Clones

Task

Design an algorithm that will group together HTML documents which are similar from the perspective of a user who opens them in a web browser.

Guidelines

The dataset contains four subdirectories of increasing complexity.
Your algorithm should work reasonably well on all the “datasets”, with no optimizations or implementations specific to any one of them.
Take the time to deeply understand the problem before writing code. Even the most sophisticated solution is ineffective if it solves the wrong problem. Misalignment in problem definition leads to incorrect conclusions and wasted effort.
Explore this from as many different angles as you can. It will generate valuable questions.
From a tech stack perspective, you can use any programming language, toolset or libraries you’re comfortable with or find necessary, especially if you know it would be a better option or a more interesting one (we generally prefer Node, Python, Scala).
At Veridion, we run similar algorithms on billions of records. While your solution doesn’t need to scale to that level, it would be impressive if it does. For now, however, what matters most is your approach to solving the problem—if your solution is exceptional for the given dataset, we trust that you can scale it effectively using the right tools.
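
As one possible angle, similarity "from the perspective of a user" can be approximated by comparing what is visibly rendered as text. The sketch below is a rough starting point under that assumption: it ignores layout, CSS, and images entirely, and the names `VisibleTextParser`, `shingles`, and `group_documents`, as well as the 0.5 threshold, are hypothetical.

```python
from html.parser import HTMLParser


class VisibleTextParser(HTMLParser):
    """Collect words from text nodes, skipping <script> and <style> content."""

    HIDDEN = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.words = []
        self._hidden_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.HIDDEN:
            self._hidden_depth += 1

    def handle_endtag(self, tag):
        if tag in self.HIDDEN and self._hidden_depth:
            self._hidden_depth -= 1

    def handle_data(self, data):
        if not self._hidden_depth:
            self.words.extend(data.lower().split())


def shingles(html, size=2):
    """Set of consecutive word n-grams taken from the visible text."""
    parser = VisibleTextParser()
    parser.feed(html)
    parser.close()
    words = parser.words
    return {tuple(words[i:i + size]) for i in range(max(len(words) - size + 1, 1))}


def jaccard(a, b):
    """Size of the intersection divided by the size of the union."""
    return len(a & b) / len(a | b)


def group_documents(documents, threshold=0.5):
    """Greedily place each document in the first group whose representative
    (its first member) is at least `threshold` similar; otherwise start a group."""
    groups = []  # list of (representative shingles, member names)
    for name in sorted(documents):
        fingerprint = shingles(documents[name])
        for representative, members in groups:
            if jaccard(fingerprint, representative) >= threshold:
                members.append(name)
                break
        else:
            groups.append((fingerprint, [name]))
    return [members for _, members in groups]


# Assumed sample documents.
documents = {
    "A.html": "<h1>Spring sale</h1><p>Save on garden tools today</p>",
    "B.html": "<h1>Spring sale</h1><script>track()</script><p>Save on garden tools now</p>",
    "C.html": "<h1>Contact us</h1><p>Call our support team</p>",
}
clone_groups = group_documents(documents)
```

Greedy grouping depends on processing order and on the choice of representative, which is one of the trade-offs worth discussing in the presentation.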

Resources

If you’re ready to jump into the problem, please start with the following list of company websites:

clones_list

Expected Deliverables

Solution explanation / presentation
Provide an explanation or presentation of your solution and results. You have total creative freedom here—feel free to impress with your thinking process, the paths you took or decided not to take, the reasoning behind your decisions and what led to your approach.
Output
Your program should take one subdirectory at a time and output the grouped documents, something along the lines of: [A.html, B.html], [C.html], [D.html, E.html, F.html] … .
Code and Logic
Include the code that enabled you to achieve this task for the provided list, and ideally, for any list of any size.
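
The per-subdirectory input and the bracketed output format described above could be handled with two small helpers like these. This is a sketch under assumptions: `load_documents` and `format_groups` are hypothetical names, and files are assumed to carry an `.html` extension.

```python
from pathlib import Path


def load_documents(directory):
    """Map file name to HTML text for every .html file in one subdirectory."""
    return {
        path.name: path.read_text(encoding="utf-8", errors="replace")
        for path in sorted(Path(directory).glob("*.html"))
    }


def format_groups(groups):
    """Render groups as '[A.html, B.html], [C.html]'."""
    return ", ".join("[" + ", ".join(sorted(group)) + "]" for group in groups)


line = format_groups([["B.html", "A.html"], ["C.html"], ["D.html", "E.html", "F.html"]])
```

Reading with `errors="replace"` is a defensive choice so that a single badly encoded file does not abort the whole run.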

Submit your project

When you’re finished with the challenge, please submit the link to your GitHub project below.
